

Applied Econometric 
Time Series 



Second Edition 



WILEY 




RESTRICTED! 

NOT FOR SALE IN 
NORTH AMERICA 




Walter Enders 







SECOND EDITION 



APPLIED ECONOMETRIC 
TIME SERIES 



Walter Enders 

University of Alabama 



©WILEY 



www.wiley.com/college/enders 



WILEY SERIES IN PROBABILITY 
AND STATISTICS 



Established by 

Walter A. Shewhart and 
Samuel S. Wilks 



Editors 

David J. Balding, Peter Bloomfield, Noel A. C. Cressie,
Nicholas I. Fisher, Iain M. Johnstone, J. B. Kadane, Louise M.
Ryan, David W. Scott, Adrian F. M. Smith, Jozef L. Teugels

Editors Emeriti

Vic Barnett, J. Stuart Hunter, David G. Kendall



A complete list of the titles in this series appears at the end of this volume. 




Executive Editor Leslie Kraham 
Editorial Assistant Jessica Bartelt 
Marketing Manager Charity Robey 
Managing Editor Lari Bishop 
Associate Production Manager Kelly Tavares 
Production Editor Sarah Wolfman-Robichaud
Illustration Editor Jennifer Fisher 

This book was set in Times by Leyh Publishing LLC and printed and bound by Hamilton 
Printing. The cover was printed by Phoenix Color Corp. 



This book is printed on acid-free paper.



Copyright © 2004 by John Wiley & Sons, Inc. All rights reserved. 

No part of this publication may be reproduced, stored in a retrieval system or transmitted 
in any form or by any means, electronic, mechanical, photocopying, recording, scanning 
or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States 
Copyright Act, without either the prior written permission of the Publisher, or 
authorization through payment of the appropriate per-copy fee to the Copyright 
Clearance Center, 222 Rosewood Drive, Danvers, MA 01922, (978) 750-8400, fax
(978) 646-8600. Requests to the Publisher for permission should be addressed to the
Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030,
(201) 748-6011, fax (201) 748-6008. 

To order books or for customer service please, call 1 (800)-CALL-WILEY (225-5945). 

Library of Congress Cataloging in Publication Data: 

Enders, Walter, 1948- 

Applied econometric time series / Walter Enders.— 2nd ed. 
p. cm. 

Includes bibliographical references and index. 

ISBN 0-471-23065-0 (cloth) 

1. Econometrics. 2. Time-series analysis. I. Title 

HB139.E55 2003 
330'0 1 ’5 1 9232 — 1 

USA ISBN: 0-471-23065-0 

WIE ISBN: 0-471-45173-8 

Printed in the United States of America

10 9 8 7 6 5 4 3 2 





To Linda and to Lola's independence








PREFACE 



In revising the text, I have tried to be careful about the trade-off between being complete 
and being concise. Textbook bloat has ruined the second editions of formerly fine manu- 
scripts. No one wants to read an encyclopedic treatment of a topic or technique that has 
fallen out of style. The Internet now provides virtually unlimited access to papers on any 
number of specialized topics. As such, I tried to let readers of the first edition be my guide. 

Most of the e-mail messages I received addressed a few key issues. People 
wanted to know how to compare the out-of-sample forecasts of alternative time-series 
models. Toward this end, the latter part of Chapter 2 has been restructured and the 
Granger-Newbold (1976) and Diebold-Mariano (1995) tests for comparing out-of- 
sample forecasts are discussed in a fair amount of detail. Chapter 3 contains a num- 
ber of new developments in ARCH modeling including a discussion of IGARCH, 
EGARCH, and threshold-GARCH (TGARCH) models. Forecasting the conditional 
variance is also emphasized. Many readers of the first edition were concerned about 
multiple cointegrating vectors and inference in regressions with mixtures of stationary
and nonstationary variables. Multiple cointegrating vectors and tests for cointegration
with mixtures of I(1) and I(2) variables are examined in Chapter 6. I use the
results of Sims, Stock, and Watson (1990) to discuss issues such as selecting the
appropriate lag length in unit root tests, vector autoregressions, and cointegration
tests. The issue also arises quite naturally in developing the general-to-specific mod- 
eling strategy (see Chapter 6) for nonstationary variables. The largest change is the 
addition of an entire chapter on nonlinear time-series models. Not only have many 
articles on nonlinear time-series models appeared in the top general journals, there 
are entire journals focusing on nonlinear economic and econometric models. 

My original intent was to write a text in time-series macroeconometrics. Fortunately, 
a number of my colleagues convinced me to broaden the focus. Applied microecono- 
mists have embraced time-series methods, and the political science journals have 
become more quantitative. As in the first edition, examples are drawn from macroeco- 
nomics, agricultural economics, international finance, and my work with Todd Sandler 
on the study of transnational terrorism. You should find the examples in the text to pro- 
vide a reasonable balance between macroeconomic and microeconomic applications. 

The text is intended for those with some background in multiple regression 
analysis. I presume the reader understands the assumptions underlying the use of
ordinary least squares. All of my students are familiar with the concepts of correlation
and covariation; they also know how to use t-tests and F-tests in a regression framework.
I use terms such as mean square error, significance level, and unbiased estimate
without explaining their meaning. Two chapters of the text examine multiple
time-series techniques. To work through these chapters, it is necessary to know how
to solve a system of equations using matrix algebra. Chapter 1, entitled Difference
Equations, is the cornerstone of the text. In my experience, this material and a knowledge
of regression analysis are sufficient to bring students to the point where they are
able to read the professional journals and to embark on a serious applied study.



I take the term applied that appears in the title very seriously. Toward this end,
I believe in teaching by induction. The method is to take a simple example and build
toward more general and more complicated models. Detailed examples of each procedure
are provided. Each concludes with a step-by-step summary of the stages typically
employed in using that procedure. The approach is one of learning by doing. A
large number of solved problems are included in the body of each chapter. The
Questions and Exercises sections at the end of each chapter are especially important.
You are encouraged to work through as many of the examples and exercises as possible.
An Instructor's Manual is available to those adopting the text for their class.

Some of the techniques illustrated in the text need to be explicitly programmed. 
Structural VARs need to be estimated using a package that has the capacity to manip- 
ulate matrices. Nonlinear models need to be estimated using a package that can per- 
form nonlinear least squares and maximum likelihood estimation. Completely 
menu-driven software packages are not able to estimate every form of time-series 
model. As I tell my students, by the time a procedure appears on the menu of an 
econometric software package, it's not new. As such, to work through all of the examples
and exercises, it is necessary to have access to a software package such as
EVIEWS, MICROFIT, PC-GIVE, RATS, SAS, SHAZAM, or STATA.

Matrix packages such as MATLAB and GAUSS are not as convenient for uni- 
variate models. All of these packages have their own programming language. To 
assist you in your programming, I have written a Programming Guide to accompany 
this text. You can download the guide (at no charge) from the Wiley Web site or from 
my personal Web page: www.cba.ua.edu/~wenders. ESTIMA has an extended version 
of the guide for RATS users available at www.estima.com. Of course, it is impossible
for me to have versions of the guide for every possible platform. Most programmers 
should be able to transcribe a program written in one language into the language used 
by their personal software package. 

In spite of all my efforts, some errors have undoubtedly crept into the text. If the first 
edition is any guide, the number is embarrassingly large. I will keep a list of typos and
corrections on my Web page: www.cba.ua.edu/~wenders. Moreover, time-series methods
and techniques keep evolving very rapidly. I will try to keep you updated by posting
research notes and clarifications on my Web page. I would be happy to post any useful
programs or communications you might have; my e-mail address is wenders@cba.ua.edu. 

Many people made valuable suggestions for improving the organization, style, 
and clarity of the manuscript. I am grateful to my students, who kept me challenged 
and were quick to point out errors. Pierre Siklos and Mark Wohar made a number of 
important suggestions concerning the revised chapters. They also tested many of the 
programs used to obtain the estimations reported in the text. Pin Chung, Maria 
Crawford, and Jingan Yuan were especially helpful in carefully reading the many 
drafts of the manuscript and ferreting out numerous mistakes. Harvey Cutler, 
Selahattin Dibooglu, and Barry Falk made many helpful suggestions for the first edition
that carry over to this edition. People I never met (including Denise Young, Jung
Hoon Keem, and Celal Kuguker) wrote very nice letters pointing out a number of 
mistakes and errors in the first edition. Most of all, I would like to thank my loving 
wife, Linda, for putting up with me while I was working on the manuscript. 







ABOUT THE AUTHOR 



Walter Enders is the Lee Bidgood Professor of Economics at the University of 
Alabama. He received his doctorate in Economics in 1975 from Columbia University 
in New York. Dr. Enders’ current research focuses on the development and applica- 
tion of time-series models to areas in economics and finance. Dr. Enders has pub- 
lished numerous research articles in such journals as the Review of Economics and 
Statistics, Quarterly Journal of Economics, and the Journal of International 
Economics. He has also published articles in the American Economic Review (a journal
of the American Economic Association), the Journal of Business and Economic
Statistics (a journal of the American Statistical Association), and the American 
Political Science Review (a journal of the American Political Science Association). 
He has formal editorial responsibilities for three different journals in the area of inter- 
national economics and has served as a policy advisor to Ukraine. He is the current 
holder (along with Todd Sandler) of the National Academy of Sciences’ Estes Award 
for Behavioral Research Relevant to the Prevention of Nuclear War. The award rec- 
ognizes “ ... basic research in any field of cognitive or behavioral science that has 
employed rigorous formal or empirical methods, optimally a combination of these, to 
advance our understanding of problems or issues relating to the risk of nuclear war.” 
The National Academy presented the award for their “... joint work on transnational 
terrorism using game theory and time series analysis to document the cyclic and shift- 
ing nature of terrorist attacks in response to defensive counteractions.” 








CONTENTS 

PREFACE viii 

ABOUT THE AUTHOR x



Chapter 1 DIFFERENCE EQUATIONS 1

Introduction 1 

1 Time-Series Models 1 

2 Difference Equations and Their Solutions 6 

3 Solution by Iteration 9 

4 An Alternative Solution Methodology 14 

5 The Cobweb Model 17 

6 Solving Homogeneous Difference Equations 22 

7 Particular Solutions for Deterministic Processes 30 

8 The Method of Undetermined Coefficients 33 

9 Lag Operators 38 

10 Summary 41 
Questions and Exercises 42 
Endnotes 44 

Appendix 1.1: Imaginary Roots and de Moivre’s Theorem 44 
Appendix 1.2: Characteristic Roots in Higher-Order Equations 46 

Chapter 2 STATIONARY TIME-SERIES MODELS 48 

1 Stochastic Difference Equation Models 48 

2 ARMA Models 51

3 Stationarity 52

4 Stationarity Restrictions for an ARMA(p, q) Model 56

5 The Autocorrelation Function 60 

6 The Partial Autocorrelation Function 65 

7 Sample Autocorrelations of Stationary Series 67 

8 Box-Jenkins Model Selection 76

9 Properties of Forecasts 79 

10 A Model of the Producer Price Index 87 

11 Seasonality 93 

12 Summary and Conclusions 99 
Questions and Exercises 100 
Endnotes 104 

Appendix 2.1: Estimation of an MA(1) Process 104 
Appendix 2.2: Model Selection Criteria 105 

Chapter 3 MODELING VOLATILITY 108 

1 Economic Time Series: The Stylized Facts 108 

2 ARCH Processes 112 

3 ARCH and GARCH Estimates of Inflation 120 

4 A GARCH Model of the PPI: An Example 123

5 A GARCH Model of Risk 127

6 The ARCH-M Model 129 

7 Additional Properties of GARCH Processes 132 

8 Maximum Likelihood Estimation of GARCH Models 138 

9 Other Models of Conditional Variance 140 

10 Estimating the NYSE Composite Index 143 

11 Summary and Conclusions 150 
Questions and Exercises 151 
Endnotes 155

Chapter 4 MODELS WITH TREND 156

1 Deterministic and Stochastic Trends 157 

2 Removing the Trend 164 

3 Unit Roots and Regression Residuals 170 

4 The Monte Carlo Method 175 

5 Dickey-Fuller Tests 181

6 Examples of the Dickey-Fuller Test 185 

7 Extensions of the Dickey-Fuller Test 189 

8 Structural Change 200 

9 Power and the Deterministic Regressors 207 

10 Trends and Univariate Decompositions 215 

11 Panel Unit Root Tests 225 

12 Summary and Conclusions 229 




Questions and Exercises 230 
Endnotes 233 
Appendix: The Bootstrap 234 
Endnotes 238 

Chapter 5 MULTIEQUATION TIME-SERIES MODELS 239 

1 Intervention Analysis 240 

2 Transfer Function Models 247 

3 Estimating a Transfer Function 257 

4 Limits to Structural Multivariate Estimation 261 

5 Introduction to VAR Analysis 264 

6 Estimation and Identification 269 

7 The Impulse Response Function 272 

8 Testing Hypotheses 281 

9 Example of a Simple VAR: Terrorism and Tourism in Spain 287 

10 Structural VARs 291 

11 Examples of Structural Decompositions 295 

12 The Blanchard-Quah Decomposition 301 

13 Decomposing Real and Nominal Exchange Rates: An Example 307

14 Summary and Conclusions 310 
Questions and Exercises 311 
Endnotes 317 

Chapter 6 COINTEGRATION AND ERROR-CORRECTION MODELS 319

1 Linear Combinations of Integrated Variables 320

2 Cointegration and Common Trends 325 

3 Cointegration and Error Correction 328 

4 Testing for Cointegration: The Engle-Granger Methodology 335 

5 Illustrating the Engle-Granger Methodology 339 

6 Cointegration and Purchasing Power Parity 344 

7 Characteristic Roots, Rank, and Cointegration 347 

8 Hypothesis Testing 354 

9 Illustrating the Johansen Methodology 362 

10 General-to-Specific Modeling 366

11 Summary and Conclusions 372
Questions and Exercises 373 
Endnotes 377 







Appendix 6.1: Inference on a Cointegrating Vector 378
Appendix 6.2: Characteristic Roots, Stability, and Rank 381

Chapter 7 NONLINEAR TIME-SERIES MODELS 387 

1 Linear versus Nonlinear Adjustment 387 

2 Simple Extensions of the ARMA Model 390 

3 Threshold Autoregressive Models 393 

4 Extensions and Other Nonlinear Models 399 

5 Testing for Nonlinearity 406 

6 Estimates of Regime Switching Models 414 

7 Generalized Impulse Responses and Forecasting 423 

8 Unit Roots and Nonlinearity 429 

9 Summary and Conclusions 434 
Questions and Exercises 435 
Endnotes 438 

STATISTICAL TABLES 439 
REFERENCES 445 
INDEX 452 




CHAPTER 1

DIFFERENCE EQUATIONS 



INTRODUCTION 



The theory of difference equations underlies all of the time-series methods employed in 
later chapters of this text. It is fair to say that time-series econometrics is concerned with 
the estimation of difference equations containing stochastic components. The tradi- 
tional use of time-series analysis was to forecast the time path of a variable. Uncovering 
the dynamic path of a series improves forecasts since the predictable components of the 
series can be extrapolated into the future. The growing interest in economic dynamics 
has given a new emphasis to time-series econometrics. Stochastic difference equations 
arise quite naturally from dynamic economic models. Appropriately estimated equa- 
tions can be used for the interpretation of economic data and for hypothesis testing. 

This introductory chapter has three aims: 

1. Explain how stochastic difference equations can be used for forecasting and 
illustrate how such equations can arise from familiar economic models. The 
chapter is not meant to be a treatise on the theory of difference equations. 
Only those techniques that are essential to the appropriate estimation of lin- 
ear time-series models are presented. This chapter focuses on single equa- 
tion models; multivariate models are considered in Chapters 5 and 6. 

2. Explain what it means to solve a difference equation. The solution will deter- 
mine whether a variable has a stable or an explosive time path. Knowledge 
of the stability conditions is essential to understanding the recent innovations 
in time-series econometrics. The contemporary time-series literature pays 
special attention to the issue of stationary versus nonstationary variables.
The stability conditions underlie the conditions for stationarity.

3. Demonstrate how to find the solution to a stochastic difference equation. 
There are several different techniques that can be used; each has its own 
relative merits. A number of examples are presented to help you understand 
the different methods. Try to work through each example carefully. For 
extra practice, you should answer the exercises at the end of the chapter. 

1. TIME-SERIES MODELS 



The task facing the modern time-series econometrician is to develop reasonably simple
models capable of forecasting, interpreting, and testing hypotheses concerning
economic data. The challenge has grown over time; the original use of time-series
analysis was primarily as an aid to forecasting. As such, a methodology was devel- 
oped to decompose a series into a trend, a seasonal, a cyclical, and an irregular com- 
ponent. Uncovering the dynamic path of a series improves forecast accuracy because 
each of the predictable components can be extrapolated into the future. 

Suppose you observe the fifty data points shown in Figure 1.1 and are interested 
in forecasting the subsequent values. Using the time-series methods discussed in the 
next several chapters, it is possible to decompose this series into the trend, seasonal, 
and irregular components shown in the lower panel of the figure. As you can see, the 
trend changes the mean of the series, and the seasonal component imparts a regular
cyclical pattern with peaks occurring every twelve units of time. In practice, the trend 
and seasonal components will not be the simplistic deterministic functions shown in 
this figure. With economic data, it is typical to find that a series contains stochastic 
elements in the trend, seasonal, and irregular components. For the time being, it is 
wise to sidestep these complications so that the projection of the trend and seasonal 
components into periods 51 and beyond is straightforward. 





FIGURE 1.1 Hypothetical Time Series




Notice that the irregular component, while lacking a well-defined pattern, is some- 
what predictable. If you examine the figure closely, you will see that the positive and 
negative values occur in runs; the occurrence of a large value in any period tends to be 
followed by another large value. Short-run forecasts will make use of this positive cor- 
relation in the irregular component. Over the entire span, however, the irregular com- 
ponent exhibits a tendency to revert to zero. As shown in the lower part, the projection 
of the irregular component past period 50 rapidly decays toward zero. The overall fore- 
cast, shown in the top part of the figure, is the sum of each forecasted component. 

The general methodology used to make such forecasts entails finding the equation
of motion driving a stochastic process and using that equation to predict subsequent
outcomes. Let $y_t$ denote the value of a data point at period $t$; if we use this
notation, the example in Figure 1.1 assumes we observed $y_1$ through $y_{50}$. For $t = 1$ to
50, the equations of motion used to construct components of the $y_t$ series are

    Trend:      $T_t = 1 + 0.1t$

    Seasonal:   $S_t = 1.6 \sin(t\pi/6)$

    Irregular:  $I_t = 0.7 I_{t-1} + \varepsilon_t$

where: $T_t$ = value of the trend component in period $t$
       $S_t$ = value of the seasonal component in $t$
       $I_t$ = the value of the irregular component in $t$
       $\varepsilon_t$ = a pure random disturbance in $t$

Thus, the irregular disturbance in t is 70 percent of the previous period’s irregular dis- 
turbance plus a random disturbance term. 
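
The three equations of motion can be made concrete with a short simulation. The following Python sketch is not from the text: it assumes NumPy, draws $\varepsilon_t$ from a standard normal distribution, sets the pre-sample value $I_0 = 0$, and uses the hypothetical function names `simulate_components` and `forecast`. The forecast of the irregular component $j$ periods past the last observation is taken to be $0.7^j$ times the last observed irregular value, which is what iterating the irregular equation forward gives when every future disturbance is replaced by its mean of zero.

```python
import numpy as np


def simulate_components(n=50, rho=0.7, seed=0):
    """Build the trend, seasonal, and irregular components for t = 1..n."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1)
    trend = 1 + 0.1 * t                      # T_t = 1 + 0.1 t
    seasonal = 1.6 * np.sin(t * np.pi / 6)   # S_t = 1.6 sin(t*pi/6)
    eps = rng.normal(0.0, 1.0, size=n)       # assumption: standard normal shocks
    irregular = np.zeros(n)
    prev = 0.0                               # assumption: I_0 = 0
    for k in range(n):
        irregular[k] = rho * prev + eps[k]   # I_t = 0.7 I_{t-1} + eps_t
        prev = irregular[k]
    return t, trend, seasonal, irregular


def forecast(t_last, irregular_last, horizon, rho=0.7):
    """Sum of the projected components for periods t_last+1 .. t_last+horizon."""
    j = np.arange(1, horizon + 1)
    t_future = t_last + j
    trend_f = 1 + 0.1 * t_future
    seasonal_f = 1.6 * np.sin(t_future * np.pi / 6)
    irregular_f = rho ** j * irregular_last  # decays toward zero as j grows
    return t_future, trend_f + seasonal_f + irregular_f


t, trend, seasonal, irregular = simulate_components()
y = trend + seasonal + irregular             # the observed series y_1..y_50
t_future, y_hat = forecast(t[-1], irregular[-1], horizon=12)
```

Because the trend and seasonal terms are deterministic functions of time, only the irregular term needs a genuine forecasting rule; that is the part the later chapters estimate from data.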

Each of these three equations is a type of difference equation. In its most gen- 
eral form, a difference equation expresses the value of a variable as a function of its 
own lagged values, time, and other variables. The trend and seasonal terms are both 
functions of time and the irregular term is a function of its own lagged value and of 
the stochastic variable $\varepsilon_t$. The reason for introducing this set of equations is to make
the point that time-series econometrics is concerned with the estimation of difference 
equations containing stochastic components. The time-series econometrician may 
estimate the properties of a single series or a vector containing many interdependent 
series. Both univariate and multivariate forecasting methods are presented in the text. 
Chapter 2 shows how to estimate the irregular part of a series. Chapter 3 considers 
estimating the variance when the data exhibit periods of volatility and tranquility. 
Estimation of the trend is considered in Chapter 4, which focuses on the issue of 
whether the trend is deterministic or stochastic. Chapter 5 discusses the properties of 
a vector of stochastic difference equations, and Chapter 6 is concerned with the esti- 
mation of trends in a multivariate model. 

Although forecasting has always been the mainstay of time-series analysis, the 
growing importance of economic dynamics has generated new uses for time-series 
analysis. Many economic theories have natural representations as stochastic differ- 
ence equations. Moreover, many of these models have testable implications concern- 
ing the time path of a key economic variable. Consider the following three examples: 

1. The Random Walk Hypothesis: In its simplest form, the random walk
model suggests that day-to-day changes in the price of a stock should have
a mean value of zero. After all, if it is known that a capital gain can be
made by buying a share on day $t$ and selling it for an expected profit the
very next day, efficient speculation will drive up the current price.

Similarly, no one will want to hold a stock if it is expected to depreciate. 
Formally, the model asserts that the price of a stock should evolve according
to the stochastic difference equation

    $y_{t+1} = y_t + \varepsilon_{t+1}$

or

    $\Delta y_{t+1} = \varepsilon_{t+1}$

where: $y_t$ = the price of a share of stock on day $t$
       $\varepsilon_{t+1}$ = a random disturbance term that has an expected value of zero

Now consider the more general stochastic difference equation:

    $\Delta y_{t+1} = \alpha_0 + \alpha_1 y_t + \varepsilon_{t+1}$

The random walk hypothesis requires the testable restriction: $\alpha_0 = \alpha_1 = 0$.
Rejecting this restriction is equivalent to rejecting the theory. Given
the information available in period $t$, the theory also requires that the mean
of $\varepsilon_{t+1}$ be equal to zero; evidence that $\varepsilon_{t+1}$ is predictable invalidates the random
walk hypothesis. Again, the appropriate estimation of a single-equation
model is considered in Chapters 2 through 4.
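
As a rough illustration of the general equation in the random walk example, the Python sketch below simulates a random walk and computes least-squares point estimates of the intercept and the coefficient on $y_t$. It is not the text's procedure: the function name `estimate_drift_and_level_effect`, the starting price of 100, and the standard normal shocks are assumptions made for the example. The point estimates alone do not constitute a test of the restriction; how to judge their significance is left to the methods developed in later chapters.

```python
import numpy as np


def estimate_drift_and_level_effect(y):
    """OLS of the change y_{t+1} - y_t on a constant and y_t.

    Returns (intercept, slope, residuals).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                                   # y_{t+1} - y_t
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # regressors: 1 and y_t
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    return coef[0], coef[1], resid


# Hypothetical data: a pure random walk, so both true coefficients are zero.
rng = np.random.default_rng(1)
eps = rng.normal(size=500)
prices = 100 + np.cumsum(eps)

a0_hat, a1_hat, resid = estimate_drift_and_level_effect(prices)
print("intercept estimate:", a0_hat)
print("slope estimate:", a1_hat)
```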

2. Reduced-Form and Structural Equations: Often it is useful to collapse a 
system of difference equations into separate single-equation models. To 
illustrate the key issues involved, consider a stochastic version of 
Samuelson’s (1939) classic model: 

    $y_t = c_t + i_t$    (1.1)

    $c_t = \alpha y_{t-1} + \varepsilon_{ct}$,    $0 < \alpha < 1$    (1.2)

    $i_t = \beta(c_t - c_{t-1}) + \varepsilon_{it}$,    $\beta > 0$    (1.3)

where $y_t$, $c_t$, and $i_t$ denote real GDP, consumption, and investment in time
period $t$, respectively. In this Keynesian model, $y_t$, $c_t$, and $i_t$ are endogenous
variables. The previous period's GDP and consumption, $y_{t-1}$ and $c_{t-1}$,
are called predetermined or lagged endogenous variables. The terms $\varepsilon_{ct}$
and $\varepsilon_{it}$ are zero mean random disturbances for consumption and investment,
and the coefficients $\alpha$ and $\beta$ are parameters to be estimated.

The first equation equates aggregate output (GDP) with the sum of 
consumption and investment spending. The second equation asserts that 
consumption spending is proportional to the previous period’s GDP plus 
a random disturbance term. The third equation illustrates the accelerator 
principle. Investment spending is proportional to the change in con- 
sumption; the idea is that growth in consumption necessitates new 
investment spending. The error terms $\varepsilon_{ct}$ and $\varepsilon_{it}$ represent the portions of
consumption and investment not explained by the behavioral equations 
of the model. 



Equation (1.3) is a structural equation since it expresses the endogenous
variable $i_t$ as being dependent on the current realization of another
endogenous variable, $c_t$. A reduced-form equation is one expressing the
value of a variable in terms of its own lags, lags of other endogenous variables,
current and past values of exogenous variables, and disturbance
terms. As formulated, the consumption function is already in reduced
form; current consumption depends only on lagged income and the current
value of the stochastic disturbance term $\varepsilon_{ct}$. Investment is not in reduced
form because it depends on current period consumption.

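To derive a reduced-form equation for investment, substitute (1.2) into
the investment equation to obtain

    $i_t = \beta[\alpha y_{t-1} + \varepsilon_{ct} - c_{t-1}] + \varepsilon_{it}$
        $= \alpha\beta y_{t-1} - \beta c_{t-1} + \beta\varepsilon_{ct} + \varepsilon_{it}$

Notice that the reduced-form equation for investment is not unique.
You can lag (1.2) one period to obtain: $c_{t-1} = \alpha y_{t-2} + \varepsilon_{ct-1}$. Using this
expression, the reduced-form investment equation can also be written as

    $i_t = \alpha\beta y_{t-1} - \beta(\alpha y_{t-2} + \varepsilon_{ct-1}) + \beta\varepsilon_{ct} + \varepsilon_{it}$
        $= \alpha\beta(y_{t-1} - y_{t-2}) + \beta(\varepsilon_{ct} - \varepsilon_{ct-1}) + \varepsilon_{it}$    (1.4)

Similarly, a reduced-form equation for GDP can be obtained by substituting
(1.2) and (1.4) into (1.1):

    $y_t = \alpha y_{t-1} + \varepsilon_{ct} + \alpha\beta(y_{t-1} - y_{t-2}) + \beta(\varepsilon_{ct} - \varepsilon_{ct-1}) + \varepsilon_{it}$
        $= \alpha(1 + \beta)y_{t-1} - \alpha\beta y_{t-2} + (1 + \beta)\varepsilon_{ct} + \varepsilon_{it} - \beta\varepsilon_{ct-1}$    (1.5)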

Equation (1.5) is a univariate reduced-form equation; $y_t$ is expressed
solely as a function of its own lags and disturbance terms. A univariate
model is particularly useful for forecasting since it enables you to predict a
series based solely on its own current and past realizations. It is possible to
estimate (1.5) using the univariate time-series techniques explained in
Chapters 2 through 4. Once you have obtained estimates of $\alpha$ and $\beta$, it is
straightforward to use the observed values of $y_1$ through $y_t$ to predict all
future values in the series (i.e., $y_{t+1}$, $y_{t+2}$, ...).

Chapter 5 considers the estimation of multivariate models when all 
variables are treated as jointly endogenous. The chapter also discusses the 
restrictions needed to recover (i.e., identify) the structural model from the 
estimated reduced-form model. 
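
One way to check the algebra behind the univariate reduced form (1.5) is to simulate the structural system (1.1) through (1.3) and confirm that the simulated GDP series satisfies (1.5) period by period. The Python sketch below does this under assumptions that are not in the text: the parameter values 0.8 and 0.5, the starting values, the standard normal disturbances, and the function names `simulate_samuelson` and `reduced_form_gdp` are all hypothetical.

```python
import numpy as np


def simulate_samuelson(alpha, beta, eps_c, eps_i, y0=1.0, c0=0.5):
    """Iterate (1.1)-(1.3). eps_c[t] and eps_i[t] are the period-t shocks;
    index 0 holds the (assumed) starting values and its shocks are unused."""
    n = len(eps_c)
    y = np.zeros(n)
    c = np.zeros(n)
    inv = np.zeros(n)
    y[0], c[0] = y0, c0
    for t in range(1, n):
        c[t] = alpha * y[t - 1] + eps_c[t]             # (1.2)
        inv[t] = beta * (c[t] - c[t - 1]) + eps_i[t]   # (1.3)
        y[t] = c[t] + inv[t]                           # (1.1)
    return y, c, inv


def reduced_form_gdp(alpha, beta, y, eps_c, eps_i):
    """Right-hand side of (1.5) evaluated for t = 2, ..., n-1."""
    t = np.arange(2, len(y))
    return (alpha * (1 + beta) * y[t - 1] - alpha * beta * y[t - 2]
            + (1 + beta) * eps_c[t] + eps_i[t] - beta * eps_c[t - 1])


rng = np.random.default_rng(0)
eps_c = rng.normal(size=40)
eps_i = rng.normal(size=40)
y, c, inv = simulate_samuelson(0.8, 0.5, eps_c, eps_i)

# Largest discrepancy between the structural path and the reduced form.
max_gap = np.max(np.abs(y[2:] - reduced_form_gdp(0.8, 0.5, y, eps_c, eps_i)))
print("largest discrepancy:", max_gap)
```

The comparison starts at $t = 2$ because (1.5) uses the lagged consumption equation, which requires the previous period's consumption to have been generated by (1.2).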

3. Error-Correction: Forward and Spot Prices: Certain commodities and 
financial instruments can be bought and sold on the spot market (for 
immediate delivery) or for delivery at some specified future date. For 
example, suppose that the price of a particular foreign currency on the spot
market is $s_t$ dollars and that the price of the currency for delivery one
period into the future is $f_t$ dollars. Now, consider a speculator who purchased
forward currency at the price $f_t$ dollars per unit. At the beginning of
period $t + 1$, the speculator receives the currency and pays $f_t$ dollars per unit
received. Since spot foreign exchange can be sold at $s_{t+1}$, the speculator
can earn a profit (or loss) of $s_{t+1} - f_t$ per unit transacted.

The Unbiased Forward Rate (UFR) hypothesis asserts that expected 
profits from such speculative behavior should be zero. Formally, the 
hypothesis posits the following relationship between forward and spot 
exchange rates: 

    $s_{t+1} = f_t + \varepsilon_{t+1}$    (1.6)

where $\varepsilon_{t+1}$ has a mean value of zero from the perspective of time period $t$.

In (1.6), the forward rate in $t$ is an unbiased estimate of the spot rate
in $t + 1$. Thus, suppose you collected data on the two rates and estimated
the regression

    $s_{t+1} = \alpha_0 + \alpha_1 f_t + \varepsilon_{t+1}$

If you were able to conclude that $\alpha_0 = 0$, $\alpha_1 = 1$, and that the regression
residuals $\varepsilon_{t+1}$ have a mean value of zero from the perspective of time
period $t$, the UFR hypothesis could be maintained.

The spot and forward markets are said to be in long-run equilibrium
when $\varepsilon_{t+1} = 0$. Whenever $s_{t+1}$ turns out to differ from $f_t$, some sort of
adjustment must occur to restore the equilibrium in the subsequent period.
Consider the adjustment process

    $s_{t+2} = s_{t+1} - \alpha[s_{t+1} - f_t] + \varepsilon_{st+2}$,    $\alpha > 0$    (1.7)

    $f_{t+1} = f_t + \beta[s_{t+1} - f_t] + \varepsilon_{ft+1}$,    $\beta > 0$    (1.8)

where $\varepsilon_{st+2}$ and $\varepsilon_{ft+1}$ both have a mean value of zero.





Equations (1.7) and (1.8) illustrate the type of simultaneous adjustment 
mechanism considered in Chapter 6. This dynamic model is called an error- 
correction model because the movement of the variables in any period is 
related to the previous period’s gap from long-run equilibrium. If the spot 
rate $s_{t+1}$ turns out to equal the forward rate $f_t$, (1.7) and (1.8) state that the
spot rate and forward rates are expected to remain unchanged. If there is a
positive gap between the spot and forward rates so that $s_{t+1} - f_t > 0$, (1.7)
and (1.8) lead to the prediction that the spot rate will fall and the forward 
rate will rise. 
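
The adjustment mechanism in (1.7) and (1.8) can be traced numerically. Subtracting (1.8) from (1.7) with both disturbances set to zero shows that each period's gap between the spot and forward rates equals $(1 - \alpha - \beta)$ times the previous gap, so the gap shrinks whenever $|1 - \alpha - \beta| < 1$. The Python sketch below is an illustration only: the adjustment speeds 0.3 and 0.2, the starting rates, and the function name `simulate_error_correction` are assumptions, and the shocks are switched off so that the decay is easy to see.

```python
import numpy as np


def simulate_error_correction(alpha, beta, spot0, fwd0, eps_s, eps_f):
    """Iterate (1.7)-(1.8).

    spot[k] plays the role of s_{t+1} and fwd[k] the role of f_t, so that
    spot[k] - fwd[k] is the gap that drives the next period's adjustment.
    """
    n = len(eps_s)
    spot = np.empty(n + 1)
    fwd = np.empty(n + 1)
    spot[0], fwd[0] = spot0, fwd0
    for k in range(n):
        gap = spot[k] - fwd[k]
        spot[k + 1] = spot[k] - alpha * gap + eps_s[k]   # (1.7)
        fwd[k + 1] = fwd[k] + beta * gap + eps_f[k]      # (1.8)
    return spot, fwd


# Hypothetical values: a positive initial gap of 0.1 and no shocks.
n_periods = 10
zeros = np.zeros(n_periods)
spot, fwd = simulate_error_correction(0.3, 0.2, 1.1, 1.0, zeros, zeros)
gaps = spot - fwd
print(gaps)
```

With a positive initial gap, the first step lowers the spot rate and raises the forward rate, which matches the prediction stated in the text.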

2. DIFFERENCE EQUATIONS AND THEIR SOLUTIONS

Although many of the ideas in the previous section were probably familiar to you, it
is necessary to formalize some of the concepts used. In this section, we will examine
the type of difference equation used in econometric analysis and make explicit what
it means to "solve" such equations. To begin our examination of difference equations,
consider the function $y = f(t)$. If we evaluate the function when the independent
variable $t$ takes on the specific value $t^*$, we get a specific value for the dependent variable
called $y_{t^*}$. Formally, $y_{t^*} = f(t^*)$. Using this same notation, $y_{t^*+h}$ represents the value
of $y$ when $t$ takes on the specific value $t^* + h$. The first difference of $y$ is defined as




the value of the function when evaluated at $t = t^* + h$ minus the value of the function
evaluated at $t^*$:

$$\Delta y_{t^*+h} \equiv f(t^*+h) - f(t^*) = y_{t^*+h} - y_{t^*} \tag{1.9}$$

Differential calculus allows the change in the independent variable (i.e., the term 
h) to approach zero. Since most economic data is collected over discrete periods, 
however, it is more useful to allow the length of the time period to be greater than 
zero. Using difference equations, we normalize units so that h represents a unit 
change in t (i.e., h = 1) and consider the sequence of equally spaced values of the 
independent variable. Without any loss of generality, we can always drop the asterisk
on $t^*$. We can then form the first differences:

$$\Delta y_t = f(t) - f(t-1) \equiv y_t - y_{t-1}$$

$$\Delta y_{t+1} = f(t+1) - f(t) \equiv y_{t+1} - y_t$$

$$\Delta y_{t+2} = f(t+2) - f(t+1) \equiv y_{t+2} - y_{t+1}$$

Often it will be convenient to express the entire sequence of values
$\{\ldots, y_{t-2}, y_{t-1}, y_t, y_{t+1}, y_{t+2}, \ldots\}$ as $\{y_t\}$. We can then refer to any particular value in the sequence as
$y_t$. Unless specified, the index $t$ runs from $-\infty$ to $+\infty$. In time-series econometric
models, we use $t$ to represent "time" and $h$ to represent the length of a time period.
Thus, $y_t$ and $y_{t+1}$ might represent the realizations of the $\{y_t\}$ sequence in the first and
second quarters of 2004, respectively.

In the same way we can form the second difference as the change in the first dif- 
ference. Consider 

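$$\Delta^2 y_t \equiv \Delta(\Delta y_t) = \Delta(y_t - y_{t-1}) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}$$

$$\Delta^2 y_{t+1} \equiv \Delta(\Delta y_{t+1}) = \Delta(y_{t+1} - y_t) = (y_{t+1} - y_t) - (y_t - y_{t-1}) = y_{t+1} - 2y_t + y_{t-1}$$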

The $n$th difference ($\Delta^n$) is defined analogously. At this point, we risk taking the
theory of difference equations too far. As you will see, the need to use second differ- 
ences rarely arises in time-series analysis. It is safe to say that third and higher-order 
differences are never used in applied work. 
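
As a small illustration of these definitions, the sketch below computes first and second differences of a short sequence in plain Python. The helper name `diff` and the example sequence $y_t = t^2$ are choices made here for illustration only.

```python
def diff(seq):
    """First difference: element k of the result is seq[k+1] - seq[k]."""
    return [b - a for a, b in zip(seq, seq[1:])]

y = [t ** 2 for t in range(6)]      # y_t = t^2 for t = 0, ..., 5
d1 = diff(y)                        # first differences
d2 = diff(d1)                       # second differences (difference of the difference)

# The second difference agrees with y_t - 2*y_{t-1} + y_{t-2}
direct = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]
print(d1, d2, direct)
```

Each differencing pass shortens the sequence by one observation, which is why `d2` has two fewer elements than `y`.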

Since this text considers linear time-series methods, it is possible to examine 
only the special case of an nth-order linear difference equation with constant coeffi- 
cients. The form for this special type of difference equation is given by 

$$y_t = a_0 + \sum_{i=1}^{n} a_i y_{t-i} + x_t \tag{1.10}$$

The order of the difference equation is given by the value of $n$. The equation is
linear because all values of the dependent variable are raised to the first power. Economic
theory may dictate instances in which the various $a_i$ are functions of variables within the
economy. However, as long as they do not depend on any of the values of $y_t$ or $x_t$, we
can regard them as parameters. The term $x_t$ is called the forcing process. The form of
the forcing process can be very general; $x_t$ can be any function of time, current and
lagged values of other variables, and/or stochastic disturbances. From an appropriate
choice of the forcing process, we can obtain a wide variety of important macroeconomic




models. Re-examine equation (1.5), the reduced-form equation for GDP. This equation
is a second-order difference equation since $y_t$ depends on $y_{t-2}$. The forcing process is the
expression $(1 + \beta)\varepsilon_{ct} + \varepsilon_{it} - \beta\varepsilon_{ct-1}$. You will note that (1.5) has no intercept term
corresponding to the expression $a_0$ in (1.10).

An important special case for the {x,} sequence is 

$$x_t = \sum_{i=0}^{\infty} \beta_i \varepsilon_{t-i}$$

where the $\beta_i$ are constants (some of which can equal zero) and the individual elements
of the sequence $\{\varepsilon_t\}$ are not functions of the $y_t$. At this point it is useful to allow the
$\{\varepsilon_t\}$ sequence to be nothing more than a sequence of unspecified exogenous
variables. For example, let $\{\varepsilon_t\}$ be a random error term and set $\beta_0 = 1$ and
$\beta_1 = \beta_2 = \ldots = 0$; in this case, (1.10) becomes the autoregression equation:

$$y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \ldots + a_n y_{t-n} + \varepsilon_t$$

Let $n = 1$, $a_0 = 0$, and $a_1 = 1$ to obtain the random walk model. Notice that
equation (1.10) can be written in terms of the difference operator ($\Delta$). Subtracting $y_{t-1}$
from (1.10), we obtain

$$y_t - y_{t-1} = a_0 + (a_1 - 1)y_{t-1} + \sum_{i=2}^{n} a_i y_{t-i} + x_t$$

or defining $\gamma = (a_1 - 1)$, we get

$$\Delta y_t = a_0 + \gamma y_{t-1} + \sum_{i=2}^{n} a_i y_{t-i} + x_t \tag{1.11}$$

Clearly, equation (1.11) is simply a modified version of (1.10). 

A solution to a difference equation expresses the value of $y_t$ as a function of the
elements of the $\{x_t\}$ sequence and $t$ (and possibly some given values of the $\{y_t\}$
sequence called initial conditions). Examining (1.11) makes it clear that there is a
strong analogy to integral calculus, where the problem is to find a primitive function
from a given derivative. We seek to find the primitive function $f(t)$, given an
equation expressed in the form of (1.10) or (1.11). Notice that a solution is a function
rather than a number. The key property of a solution is that it satisfies the difference
equation for all permissible values of $t$ and $\{x_t\}$. Thus, the substitution of a solution
into the difference equation must result in an identity. For example, consider the
simple difference equation $\Delta y_t = 2$ (or $y_t = y_{t-1} + 2$). You can easily verify that a
solution to this difference equation is $y_t = 2t + c$, where $c$ is any arbitrary constant. By
definition, if $2t + c$ is a solution, it must hold for all permissible values of $t$. Thus for
period $t-1$, $y_{t-1} = 2(t-1) + c$. Now substitute the solution into the difference equation
to form

$$2t + c = 2(t-1) + c + 2 \tag{1.12}$$

It is straightforward to carry out the algebra and verify that (1.12) is an identity. 
This simple example also illustrates that the solution to a difference equation need not 
be unique; there is a solution for any arbitrary value of c. 







Another useful example is provided by the irregular term shown in Figure 1.1;
recall that the equation for this expression is $I_t = 0.7I_{t-1} + \varepsilon_t$. You can verify that the
solution to this first-order equation is

$$I_t = \sum_{i=0}^{\infty} (0.7)^i \varepsilon_{t-i} \tag{1.13}$$

Since (1.13) holds for all time periods, the value of the irregular component in
$t - 1$ is given by

$$I_{t-1} = \sum_{i=0}^{\infty} (0.7)^i \varepsilon_{t-1-i} \tag{1.14}$$

Now substitute (1.13) and (1.14) into $I_t = 0.7I_{t-1} + \varepsilon_t$ to obtain

$$\varepsilon_t + 0.7\varepsilon_{t-1} + (0.7)^2\varepsilon_{t-2} + (0.7)^3\varepsilon_{t-3} + \ldots = 0.7[\varepsilon_{t-1} + 0.7\varepsilon_{t-2} + (0.7)^2\varepsilon_{t-3} + (0.7)^3\varepsilon_{t-4} + \ldots] + \varepsilon_t \tag{1.15}$$

The two sides of (1.15) are identical; this proves that (1.13) is a solution to the
first-order stochastic difference equation $I_t = 0.7I_{t-1} + \varepsilon_t$. Be aware of the distinction
between reduced-form equations and solutions. Since $I_t = 0.7I_{t-1} + \varepsilon_t$ holds for all
values of $t$, it follows that $I_{t-1} = 0.7I_{t-2} + \varepsilon_{t-1}$. Combining these two equations yields

$$I_t = 0.7[0.7I_{t-2} + \varepsilon_{t-1}] + \varepsilon_t = 0.49I_{t-2} + 0.7\varepsilon_{t-1} + \varepsilon_t \tag{1.16}$$

Equation (1.16) is a reduced-form equation since it expresses $I_t$ in terms of its
own lags and disturbance terms. However, (1.16) does not qualify as a solution
because it contains the "unknown" value of $I_{t-2}$. To qualify as a solution, (1.16) must
express $I_t$ in terms of the elements $x_t$, $t$, and any given initial conditions.
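
The claim that (1.13) solves $I_t = 0.7I_{t-1} + \varepsilon_t$ can also be checked numerically. The sketch below is only an approximation under stated assumptions: the infinite sum is truncated at 200 lags, the shocks are drawn as standard normal values, and the names `irregular`, `eps`, and `N` are hypothetical.

```python
import random

random.seed(0)
N = 400                                   # number of simulated shocks (assumption)
eps = [random.gauss(0.0, 1.0) for _ in range(N)]

def irregular(t, lags=200):
    """Truncated version of (1.13): sum over i < lags of 0.7**i * eps[t - i]."""
    return sum(0.7 ** i * eps[t - i] for i in range(lags))

t = N - 1
lhs = irregular(t)                        # I_t from the postulated solution
rhs = 0.7 * irregular(t - 1) + eps[t]     # 0.7 * I_{t-1} + eps_t

# The two sides differ only by the truncated tail term 0.7**200 * eps[t - 200]
gap = abs(lhs - rhs)
print(gap)
```

Because $0.7^{200}$ is negligible, the truncation error is far below floating-point precision for this example; a coefficient closer to one would need more lags.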

3. SOLUTION BY ITERATION

The solution given by (1.13) was simply postulated. The remaining portions of this 
chapter develop the methods you can use to obtain such solutions. Each method has 
its own merits; knowing the most appropriate to use in a particular circumstance is a 
skill that comes only with practice. This section develops the method of iteration. 
Although iteration is the most cumbersome and time-intensive method, most people 
find it to be very intuitive. 

If the value of $y$ in some specific period is known, a direct method of solution is
to iterate forward from that period to obtain the subsequent time path of the entire $y$
sequence. Refer to this known value of $y$ as the initial condition or the value of $y$ in
time period 0 (denoted by $y_0$). It is easiest to illustrate the iterative technique using
the first-order difference equation:

$$y_t = a_0 + a_1 y_{t-1} + \varepsilon_t \tag{1.17}$$

Given the value of $y_0$, it follows that $y_1$ will be given by

$$y_1 = a_0 + a_1 y_0 + \varepsilon_1$$

In the same way, $y_2$ must be

$$y_2 = a_0 + a_1 y_1 + \varepsilon_2$$
$$= a_0 + a_1[a_0 + a_1 y_0 + \varepsilon_1] + \varepsilon_2$$
$$= a_0 + a_0 a_1 + (a_1)^2 y_0 + a_1 \varepsilon_1 + \varepsilon_2$$






Continuing the process in order to find $y_3$, we obtain

$$y_3 = a_0 + a_1 y_2 + \varepsilon_3 = a_0[1 + a_1 + (a_1)^2] + (a_1)^3 y_0 + a_1^2 \varepsilon_1 + a_1 \varepsilon_2 + \varepsilon_3$$

You can easily verify that for all $t > 0$, repeated iteration yields

$$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i} \tag{1.18}$$

Equation (1.18) is a solution to (1.17) since it expresses $y_t$ as a function of $t$, the
forcing process $x_t = \sum (a_1)^i \varepsilon_{t-i}$, and the known value of $y_0$. As an exercise, it is useful
to show that iteration from $y_t$ back to $y_0$ yields exactly the formula given by (1.18).
Since $y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$, it follows that

$$y_t = a_0 + a_1[a_0 + a_1 y_{t-2} + \varepsilon_{t-1}] + \varepsilon_t$$
$$= a_0(1 + a_1) + a_1 \varepsilon_{t-1} + \varepsilon_t + a_1^2[a_0 + a_1 y_{t-3} + \varepsilon_{t-2}]$$

Continuing the iteration back to period 0 yields equation (1.18).
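
Forward iteration and the closed form (1.18) can be compared directly. The following is a minimal sketch, assuming illustrative parameter values ($a_0 = 2$, $a_1 = 0.8$, $y_0 = 5$) and simulated normal shocks; the function names are hypothetical.

```python
import random

def iterate(a0, a1, y0, eps):
    """Iterate (1.17) forward from y0; eps[k] holds the shock for period k + 1."""
    path = [y0]
    for e in eps:
        path.append(a0 + a1 * path[-1] + e)
    return path

def closed_form(a0, a1, y0, eps, t):
    """Evaluate (1.18) at period t >= 1."""
    const = a0 * sum(a1 ** i for i in range(t))
    shocks = sum(a1 ** i * eps[t - i - 1] for i in range(t))   # eps_{t-i}
    return const + a1 ** t * y0 + shocks

random.seed(1)
eps = [random.gauss(0, 1) for _ in range(20)]
path = iterate(2.0, 0.8, 5.0, eps)
max_gap = max(abs(path[t] - closed_form(2.0, 0.8, 5.0, eps, t))
              for t in range(1, 21))
print(max_gap)   # differences are at the level of floating-point rounding
```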

Iteration without an Initial Condition 

Suppose you were not given the initial condition for $y_0$. The solution given by (1.18)
would no longer be appropriate because the value of $y_0$ is an unknown. You could not
select this initial value of $y$ and iterate forward, nor could you iterate backward from
$y_t$ and simply choose to stop at $t = t_0$. Thus, suppose we continued to iterate backward
by substituting $a_0 + a_1 y_{-1} + \varepsilon_0$ for $y_0$ in (1.18):

$$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t[a_0 + a_1 y_{-1} + \varepsilon_0] + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i}$$

$$= a_0 \sum_{i=0}^{t} a_1^i + \sum_{i=0}^{t} a_1^i \varepsilon_{t-i} + a_1^{t+1} y_{-1} \tag{1.19}$$

Continuing to iterate backward another $m$ periods, we obtain

$$y_t = a_0 \sum_{i=0}^{t+m} a_1^i + \sum_{i=0}^{t+m} a_1^i \varepsilon_{t-i} + a_1^{t+m+1} y_{-m-1} \tag{1.20}$$

Now examine the pattern emerging from (1.19) and (1.20). If $|a_1| < 1$, the term
$a_1^{t+m+1}$ approaches zero as $m$ approaches infinity. Also, the infinite sum
$[1 + a_1 + (a_1)^2 + \ldots]$ converges to $1/(1 - a_1)$. Thus, if we temporarily assume that $|a_1| < 1$, after
continual substitution, (1.20) can be written as

$$y_t = a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i} \tag{1.21}$$

You should take a few minutes to convince yourself that (1.21) is a solution to 
the original difference equation (1.17); substitution of (1.21) into (1.17) yields an 










identity. However, (1.21) is not a unique solution. For any arbitrary value of $A$, a
solution to (1.17) is given by

$$y_t = A a_1^t + a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i} \tag{1.22}$$

To verify that for any arbitrary value of $A$ (1.22) is a solution, substitute (1.22)
into (1.17) to obtain

$$a_0/(1 - a_1) + A a_1^t + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i} = a_0 + a_1\left[a_0/(1 - a_1) + A a_1^{t-1} + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-1-i}\right] + \varepsilon_t$$

Since the two sides are identical, (1.22) is necessarily a solution to (1.17).

Reconciling the Two Iterative Methods 

Given the iterative solution (1.22), suppose that you are now given an initial condition
concerning the value of $y$ in the arbitrary period $t_0$. It is straightforward to show that
we can impose the initial condition on (1.22) to yield the same solution as (1.18). Since
(1.22) must be valid for all periods (including $t_0$), when $t = 0$, it must be true that

$$y_0 = A + a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{-i}$$

so that

$$A = y_0 - a_0/(1 - a_1) - \sum_{i=0}^{\infty} a_1^i \varepsilon_{-i} \tag{1.23}$$

Since $y_0$ is given, we can view (1.23) as the value of $A$ that renders (1.22) a
solution to (1.17) given the initial condition. Hence, the presence of the initial condition
eliminates the arbitrariness of $A$. Substituting this value of $A$ into (1.22) yields

$$y_t = \left[y_0 - a_0/(1 - a_1) - \sum_{i=0}^{\infty} a_1^i \varepsilon_{-i}\right] a_1^t + a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i} \tag{1.24}$$

Simplification of (1.24) results in

$$y_t = [y_0 - a_0/(1 - a_1)] a_1^t + a_0/(1 - a_1) + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i} \tag{1.25}$$

You should take a moment to verify that (1.25) is identical to (1.18). 
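
One way to carry out this verification is numerically. The sketch below evaluates (1.18) and (1.25) for a few illustrative coefficient values (chosen here, not taken from the text) and simulated shocks; the function names `eq_1_18` and `eq_1_25` are hypothetical labels for the two formulas.

```python
import random

def eq_1_18(a0, a1, y0, eps, t):
    """Equation (1.18); eps[k] holds the shock for period k + 1."""
    shocks = sum(a1 ** i * eps[t - i - 1] for i in range(t))
    return a0 * sum(a1 ** i for i in range(t)) + a1 ** t * y0 + shocks

def eq_1_25(a0, a1, y0, eps, t):
    """Equation (1.25); requires a1 != 1 so that a0 / (1 - a1) is defined."""
    mean = a0 / (1 - a1)
    shocks = sum(a1 ** i * eps[t - i - 1] for i in range(t))
    return (y0 - mean) * a1 ** t + mean + shocks

random.seed(4)
eps = [random.gauss(0, 1) for _ in range(15)]
gaps = [abs(eq_1_18(2.0, a1, 5.0, eps, t) - eq_1_25(2.0, a1, 5.0, eps, t))
        for a1 in (0.8, -0.5, 1.2) for t in range(1, 16)]
max_gap = max(gaps)
print(max_gap)
```

The agreement rests on the finite geometric sum $\sum_{i=0}^{t-1} a_1^i = (1 - a_1^t)/(1 - a_1)$, which holds for any $a_1 \neq 1$, so the two expressions coincide even for the explosive value 1.2.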

Nonconvergent Sequences 



Given that $|a_1| < 1$, (1.21) is the limiting value of (1.20) as $m$ grows infinitely large.
What happens to the solution in other circumstances? If $|a_1| > 1$, it is not possible to
move from (1.20) to (1.21) because the expression $|a_1|^{t+m}$ grows infinitely large as
$t+m$ approaches infinity.¹ However, if there is an initial condition, there is no need to








obtain the infinite summation. Simply select the initial condition $y_0$ and iterate
forward; the result will be (1.18):

$$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i}$$

Although the successive values of the {y,} sequence will become progressively 
larger in absolute value, all values in the series will be finite. 

A very interesting case arises if $a_1 = 1$. Rewrite (1.17) as:

$$\Delta y_t = a_0 + \varepsilon_t$$

As you should verify by iterating from $y_t$ back to $y_0$, a solution to this equation is²

$$y_t = a_0 t + \sum_{i=1}^{t} \varepsilon_i + y_0 \tag{1.26}$$

After a moment's reflection, the form of the solution is quite intuitive. In every
period $t$, the value of $y_t$ changes by $a_0 + \varepsilon_t$ units. After $t$ periods, there are $t$ such
changes; hence, the total change is $t a_0$ plus the $t$ values of the $\{\varepsilon_t\}$ sequence. Notice
that the solution contains the summation of all disturbances from $\varepsilon_1$ through $\varepsilon_t$. Thus,
when $a_1 = 1$, each disturbance has a permanent non-decaying effect on the value of
$y_t$. You should compare this result to the solution found in (1.21). For the case in
which $|a_1| < 1$, $|a_1|^t$ is a decreasing function of $t$ so that the effects of past
disturbances become successively smaller over time.
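
The unit-coefficient case can be illustrated with a short simulation. This is a sketch under assumed values (drift $a_0 = 0.5$, $y_0 = 1$, standard normal shocks), none of which come from the text.

```python
import random
from itertools import accumulate

random.seed(2)
a0, y0 = 0.5, 1.0                                # assumed illustrative values
eps = [random.gauss(0, 1) for _ in range(25)]    # eps_1, ..., eps_25

# Iterate y_t = a0 + y_{t-1} + eps_t forward from y0
path = [y0]
for e in eps:
    path.append(a0 + path[-1] + e)

# Closed form (1.26): y_t = a0*t + (eps_1 + ... + eps_t) + y0
cum = [0.0] + list(accumulate(eps))              # cum[t] = sum of the first t shocks
solution = [a0 * t + cum[t] + y0 for t in range(26)]

max_gap = max(abs(p - s) for p, s in zip(path, solution))
print(max_gap)
```

Every shock enters `cum[t]` with weight one for all later periods, which is the permanent, non-decaying effect described above.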

The importance of the magnitude of $a_1$ is illustrated in Figure 1.2. Twenty-five
random numbers with a theoretical mean equal to zero were computer-generated
and denoted by $\varepsilon_1$ through $\varepsilon_{25}$. Then the value of $y_0$ was set equal to unity and the
next twenty-five values of the $\{y_t\}$ sequence were constructed using the formula
$y_t = 0.9y_{t-1} + \varepsilon_t$. The result is shown by the thin line in Panel (a) of Figure 1.2. If you
substitute $a_0 = 0$ and $a_1 = 0.9$ into (1.18), you will see that the time path of $\{y_t\}$
consists of two parts. The first part, $0.9^t$, is shown by the slowly decaying thick line in
the panel. This term dominates the solution for relatively small values of $t$. The
influence of the random part is shown by the difference between the thin and the thick line;
you can see that the first several values of $\{\varepsilon_t\}$ are negative. As $t$ increases, the
influence of the random component becomes more pronounced.

Using the previously drawn random numbers, we again set $y_0$ equal to unity and
a second sequence was constructed using the formula $y_t = 0.5y_{t-1} + \varepsilon_t$. This second
sequence is shown by the thin line in Panel (b) of Figure 1.2. The influence of the
expression $0.5^t$ is shown by the rapidly decaying thick line. Again, as $t$ increases, the
random portion of the solution becomes more dominant in the time path of $\{y_t\}$.
When we compare the first two panels, it is clear that reducing the magnitude of $|a_1|$
increases the rate of convergence. Moreover, the discrepancies between the simulated
values of $y_t$ and the thick line are less pronounced in the second panel. As you can see
in (1.18), each value of $\varepsilon_{t-i}$ enters the solution for $y_t$ with a coefficient of $(a_1)^i$. The
smaller value of $a_1$ means that the past realizations of $\varepsilon_{t-i}$ have a smaller influence on
the current value of $y_t$.

[Figure 1.2: panels showing simulated $\{y_t\}$ sequences plotted against $t = 0, 5, \ldots, 25$]

FIGURE 1.2 Convergent and Nonconvergent Sequences

Simulating a third sequence with $a_1 = -0.5$ yields the thin line shown in Panel (c).
The oscillations are due to the negative value of $a_1$. The expression $(-0.5)^t$, shown by
the thick line, is positive when $t$ is even and negative when $t$ is odd. Since $|a_1| < 1$, the
oscillations are dampened.

The next three panels in Figure 1.2 all show nonconvergent sequences. Each uses
the initial condition $y_0 = 1$ and the same twenty-five values of $\{\varepsilon_t\}$ used in the other














simulations. The thin line in Panel (d) shows the time path of $y_t = y_{t-1} + \varepsilon_t$. Since each
value of $\varepsilon_t$ has an expected value of zero, Panel (d) illustrates a random-walk process.
Here $\Delta y_t = \varepsilon_t$ so that the change in $y_t$ is random. The nonconvergence is shown by the
tendency of $\{y_t\}$ to meander. In Panel (e), the thick line representing the explosive
expression $(1.2)^t$ dominates the random portion of the $\{y_t\}$ sequence. Also notice that
the discrepancy between the simulated $\{y_t\}$ sequence and the thick line widens as $t$
increases. The reason is that past values of $\varepsilon_{t-i}$ enter the solution for $y_t$ with the
coefficient $(1.2)^i$. As $t$ increases, these previous discrepancies become
increasingly important. Similarly, setting $a_1 = -1.2$ results in the exploding
oscillations shown in the lower-right panel of the figure. The value $(-1.2)^t$ is positive for
even values of $t$ and negative for odd values of $t$.
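
The six cases discussed for Figure 1.2 can be reproduced in outline with a few lines of Python. This is a sketch only: the shocks are freshly drawn standard normal values (the book's own twenty-five random numbers are not available here), and the function name `simulate` is hypothetical.

```python
import random

def simulate(a1, eps, y0=1.0):
    """Iterate y_t = a1 * y_{t-1} + eps_t forward from y0 (the thin line)."""
    path = [y0]
    for e in eps:
        path.append(a1 * path[-1] + e)
    return path

random.seed(3)
eps = [random.gauss(0, 1) for _ in range(25)]      # same shocks for every case
coeffs = [0.9, 0.5, -0.5, 1.0, 1.2, -1.2]          # values of a1 named in the text

paths = {a1: simulate(a1, eps) for a1 in coeffs}
# Deterministic part a1**t of the solution (the thick line in each panel)
deterministic = {a1: [a1 ** t for t in range(26)] for a1 in coeffs}

for a1 in coeffs:
    print(a1, round(paths[a1][-1], 3), round(deterministic[a1][-1], 3))
```

Using one shared list of shocks mirrors the design described in the text, so differences between the paths are attributable to $a_1$ alone.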

4. AN ALTERNATIVE SOLUTION METHODOLOGY

Solution by the iterative method breaks down in higher-order equations. The alge- 
braic complexity quickly overwhelms any reasonable attempt to find a solution. 
Fortunately, there are several alternative solution techniques that can be helpful in
solving the $n$th-order equation given by (1.10). If we use the principle that you
should learn to walk before you learn to run, it is best to step through the first-order
equation given by (1.17). Although you will be covering some familiar ground, the
first-order case illustrates the general methodology extremely well. To split the
procedure into its component parts, consider only the homogeneous portion of (1.17):³

$$y_t = a_1 y_{t-1} \tag{1.27}$$

The solution to this homogeneous equation is called the homogeneous solution;
at times it will be useful to denote the homogeneous solution by the expression $y_t^h$.
Obviously, the trivial solution $y_t = y_{t-1} = \ldots = 0$ satisfies (1.27). However, this
solution is not unique. By setting $a_0$ and all values of $\{\varepsilon_t\}$ equal to zero, (1.18) becomes
$y_t = a_1^t y_0$. Hence, $y_t = a_1^t y_0$ must be a solution to (1.27). Yet, even this solution does
not constitute the full set of solutions. It is easy to verify that the expression $a_1^t$
multiplied by any arbitrary constant $A$ satisfies (1.27). Simply substitute $y_t = A a_1^t$ and
$y_{t-1} = A a_1^{t-1}$ into (1.27) to obtain

$$A a_1^t = a_1 A a_1^{t-1}$$

Since $a_1^t = a_1 a_1^{t-1}$, it follows that $y_t = A a_1^t$ also solves (1.27). With the aid of the
thick lines in Figure 1.2, we can classify the properties of the homogeneous solution
as follows:

1. If $|a_1| < 1$, the expression $a_1^t$ converges to zero as $t$ approaches infinity.
Convergence is direct if $0 < a_1 < 1$ and oscillatory if $-1 < a_1 < 0$.

2. If $|a_1| > 1$, the homogeneous solution is not stable. If $a_1 > 1$, the
homogeneous solution approaches infinity as $t$ increases. If $a_1 < -1$, the
homogeneous solution oscillates explosively.

3. If $a_1 = 1$, any arbitrary constant $A$ satisfies the homogeneous equation
$y_t = y_{t-1}$. If $a_1 = -1$, the system is meta-stable: $a_1^t = 1$ for even values of
$t$ and $-1$ for odd values of $t$.
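
The three cases above can be summarized as a small lookup function. This is a sketch; the function name `classify` and the label strings are hypothetical, and the label returned for $a_1 = 0$ is this sketch's own addition, since that value is not covered by the list.

```python
def classify(a1):
    """Describe the path of A * a1**t according to the three cases above."""
    if abs(a1) < 1:
        if a1 > 0:
            return "direct convergence"
        if a1 < 0:
            return "oscillatory convergence"
        return "zero after one period"        # a1 == 0: not discussed in the text
    if abs(a1) > 1:
        return "explosive growth" if a1 > 1 else "explosive oscillation"
    return "constant" if a1 == 1 else "meta-stable oscillation"

for a1 in (0.9, -0.5, 1.2, -1.2, 1, -1):
    print(a1, classify(a1))
```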








Now consider (1.17) in its entirety. In the last section, you confirmed that (1.21)
is a valid solution to (1.17). Equation (1.21) is called a particular solution to the
difference equation; all such particular solutions will be denoted by the term $y_t^p$. The
term "particular" stems from the fact that a solution to a difference equation may not
be unique; hence, (1.21) is just one particular solution out of the many possibilities.
In moving to (1.22) you verified that the particular solution was not unique. The
homogeneous solution $A a_1^t$ plus the particular solution given by (1.21) constituted the
complete solution to (1.17). The general solution to a difference equation is defined
to be a particular solution plus all homogeneous solutions. Once the general solution
is obtained, the arbitrary constant $A$ can be eliminated by imposing an initial
condition for $y_0$.

The Solution Methodology

The results of the first-order case are directly applicable to the $n$th-order equation
given by (1.10). In this general case, it will be more difficult to find the particular
solution and there will be $n$ distinct homogeneous solutions. Nevertheless, the
solution methodology will always entail the following four steps:

STEP 1: Form the homogeneous equation and find all $n$ homogeneous solutions;

STEP 2: Find a particular solution;

STEP 3: Obtain the general solution as the sum of the particular solution and a
linear combination of all homogeneous solutions;

STEP 4: Eliminate the arbitrary constant(s) by imposing the initial condition(s) on
the general solution.

Before we address the various techniques that can be used to obtain homogeneous
and particular solutions, it is worthwhile to illustrate the methodology using the equation:

$$y_t = 0.9y_{t-1} - 0.2y_{t-2} + 3 \tag{1.28}$$

Clearly, this second-order equation is in the form of (1.10) with $a_0 = 3$, $a_1 = 0.9$,
$a_2 = -0.2$, and $x_t = 0$. Beginning with the first of the four steps, form the
homogeneous equation:

$$y_t - 0.9y_{t-1} + 0.2y_{t-2} = 0 \tag{1.29}$$

In the first-order case of (1.17), the homogeneous solution was $A a_1^t$. Section
6 will show you how to find the complete set of homogeneous solutions. For now,
it is sufficient to assert that the two homogeneous solutions are: $y_{1t}^h = (0.5)^t$ and
$y_{2t}^h = (0.4)^t$. To verify the first solution, note that $y_{1t-1}^h = (0.5)^{t-1}$ and
$y_{1t-2}^h = (0.5)^{t-2}$. Thus, $y_{1t}^h$ is a solution if it satisfies

$$(0.5)^t - 0.9(0.5)^{t-1} + 0.2(0.5)^{t-2} = 0$$

If we divide by $(0.5)^{t-2}$, the issue is whether

$$(0.5)^2 - 0.9(0.5) + 0.2 = 0$$

Carrying out the algebra, $0.25 - 0.45 + 0.2$ does equal zero so that $(0.5)^t$ is a
solution to (1.29). In the same way, it is easy to verify that $y_{2t}^h = (0.4)^t$ is a solution since

$$(0.4)^t - 0.9(0.4)^{t-1} + 0.2(0.4)^{t-2} = 0$$





Divide by $(0.4)^{t-2}$ to obtain $(0.4)^2 - 0.9(0.4) + 0.2 = 0.16 - 0.36 + 0.2 = 0$.

The second step is to obtain a particular solution; you can easily confirm that the
particular solution $y_t^p = 10$ solves (1.28) as: $10 = 0.9(10) - 0.2(10) + 3$.

The third step is to combine the particular solution and a linear combination of
both homogeneous solutions to obtain

$$y_t = A_1(0.5)^t + A_2(0.4)^t + 10$$

where $A_1$ and $A_2$ are arbitrary constants.

For the fourth step, assume you have two initial conditions for the $\{y_t\}$
sequence. So that we can keep our numbers reasonably round, suppose that $y_0 = 13$
and $y_1 = 11.3$. Thus, for periods zero and one, our solution must satisfy

$$13 = A_1 + A_2 + 10$$

$$11.3 = A_1(0.5) + A_2(0.4) + 10$$

Solving simultaneously for $A_1$ and $A_2$, you should find $A_1 = 1$ and $A_2 = 2$. Hence,
the solution is

$$y_t = (0.5)^t + 2(0.4)^t + 10$$
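
The fourth step and the final solution can be checked with a short script. The sketch below takes the homogeneous roots 0.5 and 0.4 and the particular solution 10 as given by the text, solves the two initial-condition equations by Cramer's rule, and compares the resulting formula with direct iteration of (1.28); the variable names are hypothetical.

```python
# Homogeneous roots and particular solution asserted in the text
r1, r2, particular = 0.5, 0.4, 10.0
y0, y1 = 13.0, 11.3                      # initial conditions from the text

# Solve  A1 + A2 = y0 - 10  and  r1*A1 + r2*A2 = y1 - 10  by Cramer's rule
c0, c1 = y0 - particular, y1 - particular
det = r2 - r1                            # determinant of [[1, 1], [r1, r2]]
A1 = (c0 * r2 - c1) / det
A2 = (c1 - r1 * c0) / det

def solution(t):
    """General solution with the constants pinned down by the initial conditions."""
    return A1 * r1 ** t + A2 * r2 ** t + particular

# Iterate (1.28) forward from the two initial conditions and compare
path = [y0, y1]
for t in range(2, 15):
    path.append(0.9 * path[-1] - 0.2 * path[-2] + 3)
max_gap = max(abs(path[t] - solution(t)) for t in range(15))
print(A1, A2, max_gap)
```

The computed constants agree with $A_1 = 1$ and $A_2 = 2$ up to floating-point rounding.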

Generalizing the Method 

To show that this method is applicable to higher-order equations, consider the
homogeneous part of (1.10):

$$y_t = \sum_{i=1}^{n} a_i y_{t-i} \tag{1.30}$$

As shown in Section 6, there are $n$ homogeneous solutions that satisfy (1.30). For
now, it is sufficient to demonstrate the following proposition: If $y_t^h$ is a homogeneous
solution to (1.30), $A y_t^h$ is also a solution for any arbitrary constant $A$. By assumption,
$y_t^h$ solves the homogeneous equation so that

$$y_t^h = \sum_{i=1}^{n} a_i y_{t-i}^h \tag{1.31}$$

The expression $A y_t^h$ is also a solution if:

$$A y_t^h = \sum_{i=1}^{n} a_i A y_{t-i}^h \tag{1.32}$$

We know (1.32) is satisfied because dividing each term by $A$ yields (1.31). Now
suppose that there are two separate solutions to the homogeneous equation denoted
by $y_{1t}^h$ and $y_{2t}^h$. It is straightforward to show that for any two constants $A_1$ and $A_2$, the
linear combination $A_1 y_{1t}^h + A_2 y_{2t}^h$ is also a solution to the homogeneous equation. If
$A_1 y_{1t}^h + A_2 y_{2t}^h$ is a solution to (1.30), it must satisfy

$$A_1 y_{1t}^h + A_2 y_{2t}^h = a_1(A_1 y_{1t-1}^h + A_2 y_{2t-1}^h) + a_2(A_1 y_{1t-2}^h + A_2 y_{2t-2}^h) + \ldots + a_n(A_1 y_{1t-n}^h + A_2 y_{2t-n}^h)$$






Regrouping terms, we want to know if

$$\left[A_1 y_{1t}^h - A_1 \sum_{i=1}^{n} a_i y_{1t-i}^h\right] + \left[A_2 y_{2t}^h - A_2 \sum_{i=1}^{n} a_i y_{2t-i}^h\right] = 0$$

Since $A_1 y_{1t}^h$ and $A_2 y_{2t}^h$ are separate solutions to (1.30), each of the expressions in
brackets is zero. Hence, the linear combination is necessarily a solution to the
homogeneous equation. This result easily generalizes to all $n$ homogeneous solutions to an
$n$th-order equation.

Finally, the use of Step 3 is appropriate since the sum of any particular solution
and any linear combination of all homogeneous solutions is also a solution. To prove
this proposition, substitute the sum of the particular and homogeneous solutions into
(1.10) to obtain

$$y_t^p + y_t^h = a_0 + \sum_{i=1}^{n} a_i(y_{t-i}^p + y_{t-i}^h) + x_t \tag{1.33}$$

Recombining the terms in (1.33), we want to know if

$$\left[y_t^p - a_0 - \sum_{i=1}^{n} a_i y_{t-i}^p - x_t\right] + \left[y_t^h - \sum_{i=1}^{n} a_i y_{t-i}^h\right] = 0 \tag{1.34}$$

Since $y_t^p$ solves (1.10), the expression in the first bracket of (1.34) is zero. Since
$y_t^h$ solves the homogeneous equation, the expression in the second bracket is zero.
Thus, (1.34) is an identity; the sum of the homogeneous and particular solutions
solves (1.10).



5. THE COBWEB MODEL

An interesting way to illustrate the methodology outlined in the previous section is to 
consider a stochastic version of the traditional cobweb model. Since the model was 
originally developed to explain the volatility in agricultural prices, let the market for 
a product — say wheat — be represented by 

$$d_t = a - \gamma p_t \qquad \gamma > 0 \tag{1.35}$$

$$s_t = b + \beta p_t^* + \varepsilon_t \qquad \beta > 0 \tag{1.36}$$

$$s_t = d_t \tag{1.37}$$

where: $d_t$ = demand for wheat in period $t$

$s_t$ = supply of wheat in $t$

$p_t$ = market price of wheat in $t$

$p_t^*$ = price that farmers expect to prevail at $t$

$\varepsilon_t$ = a zero mean stochastic supply shock

and parameters $a$, $b$, $\gamma$, and $\beta$ are all positive such that $a > b$.⁴

The nature of the model is such that consumers buy as much wheat as is desired
at the market clearing price $p_t$. At planting time, farmers do not know the price
prevailing at harvest time; they base their supply decision on the expected price ($p_t^*$). The









actual quantity produced depends on the planned quantity $b + \beta p_t^*$ plus a random
supply shock $\varepsilon_t$. Once the product is harvested, market equilibrium requires that the
quantity supplied equal the quantity demanded. Unlike the actual market for wheat,
the model does not allow for the possibility of storage. The essence of the cobweb
model is that farmers form their expectations in a naive fashion; let farmers use last
year's price as the expected market price

$$p_t^* = p_{t-1} \tag{1.38}$$

Point E in Figure 1.3 represents the long-run equilibrium price and quantity
combination. Note that the equilibrium concept in this stochastic model differs from that
of the traditional cobweb model. If the system is stable, successive prices will tend to
converge to point E. However, the nature of the stochastic equilibrium is such that the
ever-present supply shocks prevent the system from remaining at E. Nevertheless, it is
useful to solve for the long-run price. If we set all values of the $\{\varepsilon_t\}$ sequence equal to
zero, set $p_t = p_{t-1} = \ldots = p$, and equate supply and demand, the long-run equilibrium
price is given by $p = (a - b)/(\gamma + \beta)$. Similarly, the equilibrium quantity ($s$) is given by
$s = (a\beta + \gamma b)/(\gamma + \beta)$.

To understand the dynamics of the system, suppose that farmers in $t$ plan to
produce the equilibrium quantity $s$. However, let there be a negative supply shock such
that the actual quantity produced turns out to be $s_t$. As shown by point 1 in Figure 1.3,
consumers are willing to pay $p_t$ for the quantity $s_t$; hence, market equilibrium in $t$
occurs at point 1. Updating one period allows us to see the main result of the cobweb
model. For simplicity, assume that all subsequent values of the supply shock are zero
(i.e., $\varepsilon_{t+1} = \varepsilon_{t+2} = \ldots = 0$). At the beginning of period $t + 1$, farmers expect the price at
harvest time to be the price of the previous period; thus $p_{t+1}^* = p_t$. Accordingly, they
produce and market quantity $s_{t+1}$ (see point 2 in the figure); consumers, however, are
willing to buy quantity $s_{t+1}$ only if the price falls to that indicated by $p_{t+1}$ (see point 3
in the figure). The next period begins with farmers expecting to be at point 4. The
process continually repeats until the equilibrium point E is attained.

[Figure 1.3: price on the vertical axis and quantity on the horizontal axis, with the quantities $s_t$ and $s_{t+1}$ marked]

FIGURE 1.3 The Cobweb Model

As drawn, Figure 1.3 suggests that the market will always converge to the
long-run equilibrium point. This result does not hold for all demand and supply curves. To
formally derive the stability condition, combine (1.35) through (1.38) to obtain

$$b + \beta p_{t-1} + \varepsilon_t = a - \gamma p_t$$

or

$$p_t = (-\beta/\gamma)p_{t-1} + (a - b)/\gamma - \varepsilon_t/\gamma \tag{1.39}$$

Clearly, (1.39) is a stochastic first-order linear difference equation with constant 
coefficients. To obtain the general solution, proceed using the four steps listed at the 
end of the last section: 

1. Form the homogeneous equation: $p_t = (-\beta/\gamma)p_{t-1}$. In the next section you
will learn how to find the solution(s) to a homogeneous equation. For now,
it is sufficient to verify that the homogeneous solution is

$$p_t^h = A(-\beta/\gamma)^t$$

where $A$ is an arbitrary constant.

2. If the ratio $\beta/\gamma$ is less than unity, you can iterate (1.39) backward from $p_t$ to
verify that the particular solution for the price is

$$p_t^p = \frac{a - b}{\gamma + \beta} - \frac{1}{\gamma}\sum_{i=0}^{\infty}(-\beta/\gamma)^i \varepsilon_{t-i} \tag{1.40}$$

If $\beta/\gamma > 1$, the infinite summation in (1.40) is not convergent. As
discussed in the last section, it is necessary to impose an initial condition on
(1.40) if $\beta/\gamma > 1$.

3. The general solution is the sum of the homogeneous and particular
solutions; combining these two solutions, the general solution is

$$p_t = \frac{a - b}{\gamma + \beta} - \frac{1}{\gamma}\sum_{i=0}^{\infty}(-\beta/\gamma)^i \varepsilon_{t-i} + A(-\beta/\gamma)^t \tag{1.41}$$

4. In (1.41), $A$ is an arbitrary constant that can be eliminated if we know the
price in some initial period. For convenience, let this initial period have a
time subscript of zero. Since the solution must hold for every period,
including period zero, it must be the case that

$$p_0 = \frac{a - b}{\gamma + \beta} - \frac{1}{\gamma}\sum_{i=0}^{\infty}(-\beta/\gamma)^i \varepsilon_{-i} + A(-\beta/\gamma)^0$$

Since $(-\beta/\gamma)^0 = 1$, the value of $A$ is given by

$$A = p_0 - \frac{a - b}{\gamma + \beta} + \frac{1}{\gamma}\sum_{i=0}^{\infty}(-\beta/\gamma)^i \varepsilon_{-i}$$










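Substituting this solution for $A$ back into (1.41) yields

$$p_t = \frac{a - b}{\gamma + \beta} - \frac{1}{\gamma}\sum_{i=0}^{\infty}(-\beta/\gamma)^i \varepsilon_{t-i} + (-\beta/\gamma)^t\left[p_0 - \frac{a - b}{\gamma + \beta} + \frac{1}{\gamma}\sum_{i=0}^{\infty}(-\beta/\gamma)^i \varepsilon_{-i}\right]$$

and after simplification of the two summations

$$p_t = \frac{a - b}{\gamma + \beta} - \frac{1}{\gamma}\sum_{i=0}^{t-1}(-\beta/\gamma)^i \varepsilon_{t-i} + (-\beta/\gamma)^t\left[p_0 - \frac{a - b}{\gamma + \beta}\right] \tag{1.42}$$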

We can interpret (1.42) in terms of Figure 1.3. In order to focus on the stability of
the system, temporarily assume that all values of the $\{\varepsilon_t\}$ sequence are zero.
Subsequently, we will return to a consideration of the effects of supply shocks. If the
system begins in long-run equilibrium, the initial condition is such that
$p_0 = (a-b)/(\gamma+\beta)$. In this case, inspection of equation (1.42) indicates that
$p_t = (a-b)/(\gamma+\beta)$. Thus, if we begin the process at point E, the system remains
in long-run equilibrium. Instead, suppose that the process begins at a price below
long-run equilibrium: $p_0 < (a-b)/(\gamma+\beta)$. Equation (1.42) tells us that $p_1$ is

$$p_1 = (a-b)/(\gamma+\beta) + [p_0 - (a-b)/(\gamma+\beta)](-\beta/\gamma)^1 \qquad (1.43)$$

Since $p_0 < (a-b)/(\gamma+\beta)$ and $-\beta/\gamma < 0$, it follows that $p_1$ will be above the
long-run equilibrium price $(a-b)/(\gamma+\beta)$. In period 2

$$p_2 = (a-b)/(\gamma+\beta) + [p_0 - (a-b)/(\gamma+\beta)](-\beta/\gamma)^2$$

Although $p_0 < (a-b)/(\gamma+\beta)$, $(-\beta/\gamma)^2$ is positive; hence, $p_2$ is below the
long-run equilibrium. For the subsequent periods, note that $(-\beta/\gamma)^t$ will be positive
for even values of t and negative for odd values of t. Just as we found graphically, the
successive values of the $\{p_t\}$ sequence will oscillate above and below the long-run
equilibrium price. Since $(\beta/\gamma)^t$ goes to zero if $\beta < \gamma$ and explodes if
$\beta > \gamma$, the magnitude of $\beta/\gamma$ determines whether the price actually converges
toward the long-run equilibrium. If $\beta/\gamma < 1$, the oscillations will diminish in
magnitude, and if $\beta/\gamma > 1$, the oscillations will be explosive.
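
The oscillation argument can be checked numerically. The sketch below evaluates (1.42) with every supply shock set to zero; the function name and all parameter values are illustrative assumptions, not taken from the text.

```python
def cobweb_price_path(a, b, beta, gamma, p0, periods):
    """Evaluate (1.42) for t = 0..periods with all supply shocks set to zero."""
    p_bar = (a - b) / (gamma + beta)   # long-run equilibrium price
    ratio = -beta / gamma              # root of the homogeneous equation
    return [p_bar + (p0 - p_bar) * ratio ** t for t in range(periods + 1)]

# Hypothetical parameter values, chosen only for illustration.
stable = cobweb_price_path(a=10.0, b=2.0, beta=0.5, gamma=1.0, p0=4.0, periods=6)
explosive = cobweb_price_path(a=10.0, b=2.0, beta=1.5, gamma=1.0, p0=4.0, periods=6)

for t, (ps, pe) in enumerate(zip(stable, explosive)):
    print(t, round(ps, 4), round(pe, 4))
```

With beta/gamma = 0.5 the deviations from equilibrium alternate in sign and shrink; with beta/gamma = 1.5 they alternate and grow.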

The economic interpretation of this stability condition is straightforward. The
slope of the supply curve (i.e., $\partial p_t/\partial s_t$) is $1/\beta$ and the absolute value of
the slope of the demand curve [i.e., $-\partial p_t/\partial d_t$] is $1/\gamma$. If the supply curve is
steeper than the demand curve, $1/\beta > 1/\gamma$ or $\beta/\gamma < 1$ so that the system is
stable. This is precisely the case illustrated in Figure 1.3. As an exercise, you should
draw a diagram with the demand curve steeper than the supply curve and show that the
price oscillates and diverges from the long-run equilibrium.

Now consider the effects of the supply shocks. The contemporaneous effect of a
supply shock on the price of wheat is the partial derivative of $p_t$ with respect to
$\varepsilon_t$ from (1.42):

$$\frac{\partial p_t}{\partial \varepsilon_t} = -\frac{1}{\gamma} \qquad (1.44)$$

Equation (1.44) is called the impact multiplier since it shows the impact effect
of a change in $\varepsilon_t$ on the price in t. In terms of Figure 1.3, a negative value of
$\varepsilon_t$ implies a price above the long-run price p; the price in t rises by $1/\gamma$ units
for each unit decline in current period supply. Of course, this terminology is not
specific to the cobweb model; in terms of the nth-order model given by (1.10), the
impact multiplier is the partial derivative of $y_t$ with respect to the partial change
in the forcing process.⁵

The effects of the supply shock in t persist into future periods. Updating (1.42)
by one period yields the one-period multiplier:

$$\frac{\partial p_{t+1}}{\partial \varepsilon_t} = -\frac{1}{\gamma}(-\beta/\gamma)$$



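Point 3 in Figure 1.3 illustrates how the price in t + 1 is affected by the negative
supply shock in t. It is straightforward to derive the result that the effects of
the supply shock decay over time. Since $\beta/\gamma < 1$, the absolute value of
$\partial p_t/\partial \varepsilon_t$ exceeds that of $\partial p_{t+1}/\partial \varepsilon_t$. All of the multipliers can be
derived analogously; updating (1.42) by two periods:

$$\partial p_{t+2}/\partial \varepsilon_t = -(1/\gamma)(-\beta/\gamma)^2$$

and after n periods:

$$\partial p_{t+n}/\partial \varepsilon_t = -(1/\gamma)(-\beta/\gamma)^n$$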

The time path of all such multipliers is called the impulse response function.
This function has many important applications in time-series analysis because it
shows how the entire time path of a variable is affected by a stochastic shock. Here,
the impulse function traces the effects of a supply shock in the wheat market. In other
economic applications, you may be interested in the time path of a money supply
shock or a productivity shock on real GDP.

In actuality, the function can be derived without updating (1.42) because it is
always the case that

$$\frac{\partial p_{t+j}}{\partial \varepsilon_t} = \frac{\partial p_t}{\partial \varepsilon_{t-j}}$$



To find the impulse response function, simply find the partial derivative of (1.42)
with respect to the various $\varepsilon_{t-j}$. These partial derivatives are nothing more than
the coefficients of the $\{\varepsilon_{t-j}\}$ sequence in (1.42).

Each of the three components in (1.42) has a direct economic interpretation. The
deterministic portion of the particular solution $(a-b)/(\gamma+\beta)$ is the long-run
equilibrium price; if the stability condition is met, the $\{p_t\}$ sequence tends to
converge to this long-run value. The stochastic component of the particular solution
captures the short-run price adjustments due to the supply shocks. The ultimate decay
of the coefficients of the impulse response function guarantees that the effects of
changes in the various $\varepsilon_t$ are of a short-run duration. The third component is the
expression $(-\beta/\gamma)^t A = (-\beta/\gamma)^t[p_0 - (a-b)/(\gamma+\beta)]$. The value of A is the
initial period's deviation of the price from its long-run equilibrium level. Given that
$\beta/\gamma < 1$, the importance of this initial deviation diminishes over time.
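
As a small illustration of the impulse response function, the following sketch tabulates the multipliers $-(1/\gamma)(-\beta/\gamma)^j$ read off (1.42). The function name and the values beta = 0.5 and gamma = 1 are assumptions made only for this example.

```python
def cobweb_impulse_response(beta, gamma, horizon):
    """Multipliers dp_{t+j}/d(eps_t) = -(1/gamma) * (-beta/gamma)**j for j = 0..horizon."""
    return [-(1.0 / gamma) * (-beta / gamma) ** j for j in range(horizon + 1)]

# Hypothetical slope parameters; beta/gamma < 1, so the responses decay.
irf = cobweb_impulse_response(beta=0.5, gamma=1.0, horizon=4)
print(irf)  # [-1.0, 0.5, -0.25, 0.125, -0.0625]
```

The first entry is the impact multiplier (1.44); the alternating signs reflect the oscillating price path, and the shrinking magnitudes reflect stability.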









6. SOLVING HOMOGENEOUS DIFFERENCE EQUATIONS

Higher-order difference equations arise quite naturally in economic analysis. Equation 
(1.5) — the reduced-form GDP equation resulting from Samuelson’s (1939) model — is 
an example of a second-order difference equation. Moreover, in time-series economet- 
rics it is quite typical to estimate second and higher-order equations. To begin our 
examination of homogeneous solutions, consider the second-order equation

$$y_t - a_1y_{t-1} - a_2y_{t-2} = 0 \qquad (1.45)$$

Given the findings in the first-order case, you should suspect that the homogeneous
solution has the form $y_t^h = A\alpha^t$. Substitution of this trial solution into (1.45) yields

$$A\alpha^t - a_1A\alpha^{t-1} - a_2A\alpha^{t-2} = 0 \qquad (1.46)$$

Clearly, any arbitrary value of A is satisfactory. If you divide (1.46) by $A\alpha^{t-2}$, the
problem is to find the values of $\alpha$ that satisfy

$$\alpha^2 - a_1\alpha - a_2 = 0 \qquad (1.47)$$

Solving this quadratic equation, called the characteristic equation, yields
two values of $\alpha$ called the characteristic roots. Using the quadratic formula, we find
that the two characteristic roots are

$$\alpha_1, \alpha_2 = (a_1 \pm \sqrt{d})/2 \qquad (1.48)$$

where d is the discriminant $[(a_1)^2 + 4a_2]$.

Each of these two characteristic roots yields a valid solution for (1.45). Again,
these solutions are not unique. In fact, for any two arbitrary constants $A_1$ and $A_2$, the
linear combination $A_1(\alpha_1)^t + A_2(\alpha_2)^t$ also solves (1.45). As proof, simply substitute
$y_t = A_1(\alpha_1)^t + A_2(\alpha_2)^t$ into (1.45) to obtain

$$A_1(\alpha_1)^t + A_2(\alpha_2)^t = a_1[A_1(\alpha_1)^{t-1} + A_2(\alpha_2)^{t-1}] + a_2[A_1(\alpha_1)^{t-2} + A_2(\alpha_2)^{t-2}]$$

Now, regroup terms as follows:

$$A_1[(\alpha_1)^t - a_1(\alpha_1)^{t-1} - a_2(\alpha_1)^{t-2}] + A_2[(\alpha_2)^t - a_1(\alpha_2)^{t-1} - a_2(\alpha_2)^{t-2}] = 0$$

Since $\alpha_1$ and $\alpha_2$ each solve (1.45), both terms in brackets must equal zero. As
such, the complete homogeneous solution in the second-order case is

$$y_t^h = A_1(\alpha_1)^t + A_2(\alpha_2)^t$$

Without knowing the specific values of $a_1$ and $a_2$, we cannot find the two
characteristic roots $\alpha_1$ and $\alpha_2$. Nevertheless, it is possible to characterize the
nature of the solution; three possible cases are dependent on the value of the
discriminant d.



CASE 1

If $a_1^2 + 4a_2 > 0$, d is a real number and there will be two distinct real
characteristic roots. Hence, there are two separate solutions to the homogeneous
equation denoted by $(\alpha_1)^t$ and $(\alpha_2)^t$. We already know that any linear
combination of the two is also a solution. Hence,

$$y_t^h = A_1(\alpha_1)^t + A_2(\alpha_2)^t$$

It should be clear that if the absolute value of either $\alpha_1$ or $\alpha_2$ exceeds unity,
the homogeneous solution will explode. Worksheet 1.1 examines two second-order
equations showing real and distinct characteristic roots. In the first example,
$y_t = 0.2y_{t-1} + 0.35y_{t-2}$, the characteristic roots are shown to be $\alpha_1 = 0.7$ and
$\alpha_2 = -0.5$. Hence, the full homogeneous solution is $y_t^h = A_1(0.7)^t + A_2(-0.5)^t$.
Since both roots are less than unity in absolute value, the homogeneous solution is
convergent. As you can see in the graph on the bottom left-hand side of Worksheet 1.1,
convergence is not monotonic because of the influence of the expression $(-0.5)^t$.

In the second example, $y_t = 0.7y_{t-1} + 0.35y_{t-2}$. The worksheet indicates how
to obtain the solution for the two characteristic roots. Given that one characteristic
root is 1.037, the $\{y_t\}$ sequence explodes. The influence of the negative root
($\alpha_2 = -0.337$) is responsible for the non-monotonicity of the time path. Since
$(-0.337)^t$ quickly approaches zero, the dominant root is the explosive value 1.037.
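
The two worksheet examples can be reproduced with a few lines of code. This is a minimal sketch: the helper names are assumptions, and the complex square root from the standard library is used so that the same function also covers a negative discriminant.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of alpha**2 - a1*alpha - a2 = 0, the characteristic equation (1.47)."""
    d = a1 ** 2 + 4 * a2        # discriminant from (1.48)
    sqrt_d = cmath.sqrt(d)      # complex square root also handles d < 0
    return (a1 + sqrt_d) / 2, (a1 - sqrt_d) / 2

def homogeneous_path(a1, a2, y0, y1, periods):
    """Iterate y_t = a1*y_{t-1} + a2*y_{t-2} from two starting values up to t = periods."""
    path = [y0, y1]
    for _ in range(periods - 1):
        path.append(a1 * path[-1] + a2 * path[-2])
    return path

r1, r2 = characteristic_roots(0.2, 0.35)   # first example: 0.7 and -0.5
s1, s2 = characteristic_roots(0.7, 0.35)   # second example: about 1.037 and -0.337
print(r1, r2, s1, s2)

# With A1 = A2 = 1 the closed form 0.7**t + (-0.5)**t starts at y0 = 2, y1 = 0.2.
print(homogeneous_path(0.2, 0.35, 2.0, 0.2, periods=5))
```

The iterated path and the closed form $A_1(0.7)^t + A_2(-0.5)^t$ agree term by term, which is a quick way to confirm a pair of roots.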

CASE 2

If $a_1^2 + 4a_2 = 0$, it follows that d = 0 and $\alpha_1 = \alpha_2 = a_1/2$. Hence, a
homogeneous solution is $(a_1/2)^t$. However, when d = 0, there is a second homogeneous
solution given by $t(a_1/2)^t$. To demonstrate that $y_t^h = t(a_1/2)^t$ is a homogeneous
solution, substitute it into (1.45) to determine whether

$$t(a_1/2)^t - a_1[(t-1)(a_1/2)^{t-1}] - a_2[(t-2)(a_1/2)^{t-2}] = 0$$

Divide by $(a_1/2)^{t-2}$ and form

$$-[(a_1^2/4) + a_2]t + [(a_1^2/2) + 2a_2] = 0$$

Since we are operating in the circumstance where $a_1^2 + 4a_2 = 0$, each bracketed
expression is zero; hence, $t(a_1/2)^t$ solves (1.45). Again, for arbitrary constants
$A_1$ and $A_2$, the complete homogeneous solution is

$$y_t^h = A_1(a_1/2)^t + A_2t(a_1/2)^t$$

Clearly, the system is explosive if $|a_1| > 2$. If $|a_1| < 2$, the term $A_1(a_1/2)^t$
converges, but you might think that the effect of the term $t(a_1/2)^t$ is ambiguous
[since the diminishing $(a_1/2)^t$ is multiplied by t]. This ambiguity is correct
in the limited sense that the behavior of the homogeneous solution is not
monotonic. As illustrated in Figure 1.4 for $a_1/2$ = 0.95, 0.9, and -0.9, as long
as $|a_1| < 2$, $\lim[t(a_1/2)^t]$ is necessarily zero as $t \to \infty$; thus, there is always
convergence. For $0 < a_1 < 2$, the homogeneous solution appears to explode
before ultimately converging to zero. For $-2 < a_1 < 0$, the behavior is wildly
erratic; the homogeneous solution appears to oscillate explosively before the
oscillations dampen and finally converge to zero.
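
The "explode, then converge" behavior of the repeated-root solution is easy to see numerically. The sketch below uses hypothetical helper names; it evaluates $t(a_1/2)^t$ for $a_1/2 = 0.95$ and checks that the expression satisfies (1.45) when $a_2 = -a_1^2/4$.

```python
def repeated_root_path(a1, periods):
    """Time path of t*(a1/2)**t, the second solution when a1**2 + 4*a2 = 0."""
    return [t * (a1 / 2) ** t for t in range(periods + 1)]

def residual(a1, t):
    """Left-hand side of (1.45) at y_t = t*(a1/2)**t with a2 set to -a1**2/4."""
    a2 = -a1 ** 2 / 4
    y = lambda s: s * (a1 / 2) ** s
    return y(t) - a1 * y(t - 1) - a2 * y(t - 2)

path = repeated_root_path(a1=1.9, periods=200)   # a1/2 = 0.95, one of the Figure 1.4 cases
print(round(max(path), 3), round(path[-1], 5))   # large hump early on, near zero by t = 200
print(max(abs(residual(1.9, t)) for t in range(2, 60)))  # zero up to rounding error
```

The hump arises because the factor t initially grows faster than $(a_1/2)^t$ shrinks; the geometric decay dominates eventually.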







WORKSHEET 1.1  SECOND-ORDER EQUATIONS

Example 1: $y_t = 0.2y_{t-1} + 0.35y_{t-2}$. Hence: $a_1 = 0.2$ and $a_2 = 0.35$

Form the homogeneous equation: $y_t - 0.2y_{t-1} - 0.35y_{t-2} = 0$
A check of the discriminant reveals: $d = a_1^2 + 4a_2$ so that d = 1.44. Given that
d > 0, the roots will be real and distinct.

Let the trial solution have the form: $y_t = \alpha^t$. Substitute the trial solution into
the homogeneous equation to obtain: $\alpha^t - 0.2\alpha^{t-1} - 0.35\alpha^{t-2} = 0$
Divide by $\alpha^{t-2}$ to obtain the characteristic equation: $\alpha^2 - 0.2\alpha - 0.35 = 0$
Compute the two characteristic roots:

$\alpha_1 = 0.5(a_1 + d^{1/2})$        $\alpha_2 = 0.5(a_1 - d^{1/2})$

$\alpha_1 = 0.7$                        $\alpha_2 = -0.5$

The homogeneous solution is: $A_1(0.7)^t + A_2(-0.5)^t$. The first graph shows
the time path of this solution for the case in which the arbitrary constants
equal unity and t runs from 1 to 20.

Example 2: $y_t = 0.7y_{t-1} + 0.35y_{t-2}$. Hence: $a_1 = 0.7$ and $a_2 = 0.35$

Form the homogeneous equation: $y_t - 0.7y_{t-1} - 0.35y_{t-2} = 0$
A check of the discriminant reveals: $d = a_1^2 + 4a_2$ so that d = 1.89. Given that
d > 0, the roots will be real and distinct.

Substitute the trial solution $y_t = \alpha^t$ into the homogeneous equation to obtain:
$\alpha^t - 0.7\alpha^{t-1} - 0.35\alpha^{t-2} = 0$

Divide by $\alpha^{t-2}$ to obtain the characteristic equation: $\alpha^2 - 0.7\alpha - 0.35 = 0$

Compute the two characteristic roots:

$\alpha_1 = 0.5(a_1 + d^{1/2})$        $\alpha_2 = 0.5(a_1 - d^{1/2})$

$\alpha_1 = 1.037$                      $\alpha_2 = -0.337$

The homogeneous solution is: $A_1(1.037)^t + A_2(-0.337)^t$. The second graph
shows the time path of this solution for the case in which the arbitrary
constants equal unity and t runs from 1 to 20.




FIGURE 1.4 The Homogeneous Solution $t(a_1)^t$
[Plot: horizontal axis t from 0 to 100; vertical axis from 0 to 8; curves labeled 0.95 and 0.90.]

CASE 3

If $a_1^2 + 4a_2 < 0$, it follows that d is negative so that the characteristic roots are
imaginary. Since $a_1^2 \geq 0$, imaginary roots can occur only if $a_2 < 0$. Although
this might be hard to interpret directly, if we switch to polar coordinates it is
possible to transform the roots into more easily understood trigonometric
functions. The technical details are presented in Appendix 1.1. For now, write the
two characteristic roots as

$$\alpha_1 = (a_1 + i\sqrt{-d})/2 \qquad \alpha_2 = (a_1 - i\sqrt{-d})/2$$

where $i = \sqrt{-1}$.

As shown in Appendix 1.1, you can use de Moivre's theorem to write the
homogeneous solution as

$$y_t^h = \beta_1 r^t \cos(\theta t + \beta_2) \qquad (1.49)$$

where $\beta_1$ and $\beta_2$ are arbitrary constants, $r = (-a_2)^{1/2}$, and the value of $\theta$ is
chosen so as to satisfy

$$\cos(\theta) = a_1/[2(-a_2)^{1/2}] \qquad (1.50)$$

The trigonometric functions impart a wavelike pattern to the time path of
the homogeneous solution; note that the frequency of the oscillations is
determined by $\theta$. Since $\cos(\theta t) = \cos(2\pi + \theta t)$, the stability condition is determined
solely by the magnitude of $r = (-a_2)^{1/2}$. If $|a_2| = 1$, the oscillations are of
unchanging amplitude; the homogeneous solution is periodic. The oscillations
will dampen if $|a_2| < 1$ and explode if $|a_2| > 1$.

Example: It is worthwhile to work through an exercise using an equation with
imaginary roots. The left-hand side of Worksheet 1.2 examines the behavior of
the equation $y_t = 1.6y_{t-1} - 0.9y_{t-2}$. A quick check shows that the discriminant d
is negative so that the characteristic roots are imaginary. If we transform to polar
coordinates, the value of r is given by $(0.9)^{1/2} = 0.949$. From (1.50), $\cos(\theta)$ =
1.6/(2 × 0.949) = 0.843. You can use a trig table or a calculator to show that $\theta$ =
0.567 (i.e., if $\cos(\theta) = 0.843$, $\theta = 0.567$). Thus, the homogeneous solution is

$$y_t^h = \beta_1(0.949)^t \cos(0.567t + \beta_2) \qquad (1.51)$$

The graph on the left-hand side of Worksheet 1.2 sets $\beta_1 = 1$ and $\beta_2 = 0$ and
plots the homogeneous solution for t = 1, ..., 30. Example 2 uses the same
value of $a_2$ (hence, r = 0.949) but sets $a_1 = -0.6$. Again, the value of d is
negative; however, for this set of calculations, $\cos(\theta) = -0.316$ so that $\theta$ is 1.89.
Comparing the two graphs, you can see that increasing the value of $\theta$ acts to
increase the frequency of the oscillations.
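
The polar-coordinate calculation in this example can be sketched in code. The function names below are assumptions; the formulas are (1.49) and (1.50), and the loop confirms that the trigonometric form actually satisfies the difference equation.

```python
import math

def polar_form(a1, a2):
    """Return (r, theta) for complex roots, using r = sqrt(-a2) and equation (1.50)."""
    if a1 ** 2 + 4 * a2 >= 0:
        raise ValueError("roots are real; the trigonometric form does not apply")
    r = math.sqrt(-a2)
    theta = math.acos(a1 / (2 * r))
    return r, theta

def trig_solution(a1, a2, t, beta1=1.0, beta2=0.0):
    """Evaluate beta1 * r**t * cos(theta*t + beta2), the form in (1.49)."""
    r, theta = polar_form(a1, a2)
    return beta1 * r ** t * math.cos(theta * t + beta2)

r, theta = polar_form(1.6, -0.9)       # Example 1: r is about 0.949, theta about 0.567
r2, theta2 = polar_form(-0.6, -0.9)    # Example 2: same r, theta about 1.89
print(round(r, 3), round(theta, 3), round(theta2, 2))

# Check that (1.51) solves y_t = 1.6*y_{t-1} - 0.9*y_{t-2} for t = 2..30.
worst = max(abs(trig_solution(1.6, -0.9, t)
                - 1.6 * trig_solution(1.6, -0.9, t - 1)
                + 0.9 * trig_solution(1.6, -0.9, t - 2)) for t in range(2, 31))
print(worst)   # rounding error only
```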

Stability Conditions 

The general stability conditions can be summarized using triangle ABC in Figure
1.5. Arc AOB is the boundary between Cases 1 and 3; it is the locus of points where
$d = a_1^2 + 4a_2 = 0$. The region above AOB corresponds to Case 1 (since d > 0), and the
region below AOB corresponds to Case 3 (since d < 0).

In Case 1 (in which the roots are real and distinct), stability requires that the
largest root be less than unity and the smallest root be greater than -1. The largest
characteristic root, $\alpha_1 = (a_1 + \sqrt{d})/2$, will be less than unity if

$$a_1 + (a_1^2 + 4a_2)^{1/2} < 2 \quad \text{or} \quad (a_1^2 + 4a_2)^{1/2} < 2 - a_1$$

Hence, $a_1^2 + 4a_2 < 4 - 4a_1 + a_1^2$

or

$$a_1 + a_2 < 1 \qquad (1.52)$$

The smallest root, $\alpha_2 = (a_1 - \sqrt{d})/2$, will be greater than minus one if

$$a_1 - (a_1^2 + 4a_2)^{1/2} > -2 \quad \text{or} \quad 2 + a_1 > (a_1^2 + 4a_2)^{1/2}$$



WORKSHEET 1.2

Example 1: $y_t - 1.6y_{t-1} + 0.9y_{t-2} = 0$
Example 2: $y_t + 0.6y_{t-1} + 0.9y_{t-2} = 0$

a) Check the discriminant $d = (a_1)^2 + 4a_2$

Example 1: $d = (1.6)^2 + 4(-0.9) = -1.04$
Example 2: $d = (-0.6)^2 + 4(-0.9) = -3.24$

Hence, the roots are imaginary. The homogeneous solution has the form

$$y_t^h = \beta_1 r^t \cos(\theta t + \beta_2)$$

where $\beta_1$ and $\beta_2$ are arbitrary constants.

b) Obtain the value of $r = (-a_2)^{1/2}$

Example 1: $r = (0.9)^{1/2} = 0.949$
Example 2: $r = (0.9)^{1/2} = 0.949$

c) Obtain $\theta$ from $\cos(\theta) = a_1/[2(-a_2)^{1/2}]$

Example 1: $\cos(\theta) = 1.6/[2(0.9)^{1/2}] = 0.843$
Example 2: $\cos(\theta) = -0.6/[2(0.9)^{1/2}] = -0.316$

Given $\cos(\theta)$, use a trig table to find $\theta$

Example 1: $\theta = 0.567$
Example 2: $\theta = 1.89$

d) Form the homogeneous solution: $y_t^h = \beta_1 r^t \cos(\theta t + \beta_2)$

Example 1: $y_t^h = \beta_1(0.949)^t \cos(0.567t + \beta_2)$
Example 2: $y_t^h = \beta_1(0.949)^t \cos(1.89t + \beta_2)$

For $\beta_1 = 1$ and $\beta_2 = 0$, the time paths of the homogeneous solutions are
[Graphs: one time path for each example.]










Hence: $4 + 4a_1 + a_1^2 > a_1^2 + 4a_2$
or

$$a_2 < 1 + a_1 \qquad (1.53)$$

Thus, the region of stability in Case 1 consists of all points in the region bounded
by AOBC. For any point in AOBC, conditions (1.52) and (1.53) hold and d > 0.

In Case 2 (repeated roots), $a_1^2 + 4a_2 = 0$. The stability condition is $|a_1| < 2$. Thus,
the region of stability in Case 2 consists of all points on arc AOB. In Case 3 (d < 0),
the stability condition is $r = (-a_2)^{1/2} < 1$. Hence

$$-a_2 < 1 \quad (\text{where } a_2 < 0) \qquad (1.54)$$

Thus, the region of stability in Case 3 consists of all points in region AOB. For
any point in AOB, (1.54) is satisfied and d < 0.
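
The three cases can be cross-checked against a direct computation of the roots. In the sketch below, the function names and the grid of test points are assumptions; the triangle test combines (1.52), (1.53), and (1.54) into a single condition on $(a_1, a_2)$.

```python
import cmath

def roots_inside_unit_circle(a1, a2):
    """Direct check: both roots of alpha**2 - a1*alpha - a2 = 0 have modulus below one."""
    sqrt_d = cmath.sqrt(a1 ** 2 + 4 * a2)
    return all(abs(root) < 1 for root in ((a1 + sqrt_d) / 2, (a1 - sqrt_d) / 2))

def inside_stability_triangle(a1, a2):
    """Conditions (1.52), (1.53), and (1.54) combined into one test."""
    return a1 + a2 < 1 and a2 - a1 < 1 and a2 > -1

# Arbitrary grid; the small offsets keep every point off the triangle's boundary,
# where floating-point rounding could make the two tests disagree.
grid = [(i / 10 + 0.05, j / 10 + 0.03) for i in range(-25, 26) for j in range(-15, 16)]
mismatches = [(a1, a2) for a1, a2 in grid
              if roots_inside_unit_circle(a1, a2) != inside_stability_triangle(a1, a2)]
print(len(grid), len(mismatches))
```

On this grid the two tests agree everywhere, which is what the case-by-case argument predicts.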

A succinct way to characterize the stability conditions is to state that the
characteristic roots must lie within the unit circle. Consider the semicircle drawn in
Figure 1.6. Real numbers are measured on the horizontal axis and imaginary numbers are
measured on the vertical axis. If the characteristic roots $\alpha_1$ and $\alpha_2$ are both real,
they can be plotted on the horizontal axis. Stability requires that they lie within a
circle of radius one. Complex roots will lie somewhere in the complex plane. If $a_1 > 0$,
the roots $\alpha_1 = (a_1 + i\sqrt{-d})/2$ and $\alpha_2 = (a_1 - i\sqrt{-d})/2$ can be represented by the
two points shown in Figure 1.6. For example, $\alpha_1$ is drawn by moving $a_1/2$ units along
the real axis and $\sqrt{-d}/2$ units along the imaginary axis. Using the distance formula,
the length of the radius r is given by

$$r = \sqrt{(a_1/2)^2 + (\sqrt{-d}/2)^2}$$

and, substituting $d = a_1^2 + 4a_2$, we obtain

$$r = (-a_2)^{1/2}$$



FIGURE 1.6 Characteristic Roots and the Unit Circle



The stability condition requires that r < 1. Therefore, when plotted on the
complex plane, the two roots $\alpha_1$ and $\alpha_2$ must lie within a circle of radius equal to
unity. In the time-series literature it is simply stated that stability requires that
all characteristic roots lie within the unit circle.

Higher-Order Systems 

The same method can be used to find the homogeneous solution to higher-order
difference equations. The homogeneous equation for (1.10) is

$$y_t - \sum_{i=1}^{n} a_iy_{t-i} = 0 \qquad (1.55)$$

Given the results in Section 4, you should suspect each homogeneous solution to
have the form $y_t^h = A\alpha^t$ where A is an arbitrary constant. Thus, to find the value(s)
of $\alpha$, we seek the solution for

$$A\alpha^t - \sum_{i=1}^{n} a_iA\alpha^{t-i} = 0 \qquad (1.56)$$

or, dividing through by $\alpha^{t-n}$, we seek the values of $\alpha$ that solve

$$\alpha^n - a_1\alpha^{n-1} - a_2\alpha^{n-2} - \cdots - a_n = 0 \qquad (1.57)$$

This nth-order polynomial will yield n solutions for $\alpha$. Denote these n
characteristic roots by $\alpha_1, \alpha_2, \ldots, \alpha_n$. Given the results in Section 4, the linear
combination $A_1\alpha_1^t + A_2\alpha_2^t + \cdots + A_n\alpha_n^t$ is also a solution. The arbitrary constants
$A_1$ through $A_n$ can be eliminated by imposing n initial conditions on the general
solution. The $\alpha_i$ may be real or complex numbers. Stability requires that all real
valued $\alpha_i$ be less than unity in absolute value. Complex roots will necessarily come
in pairs. Stability requires that all roots lie within the unit circle shown in
Figure 1.6.












In most circumstances there is little need to directly calculate the characteristic
roots of higher-order systems. Many of the technical details are included in Appendix
1.2 to this chapter. However, there are some useful rules for checking the stability
conditions in higher-order systems.

1. In an nth-order equation, a necessary condition for all characteristic roots
to lie inside the unit circle is

$$\sum_{i=1}^{n} a_i < 1$$

2. Since the values of the $a_i$ can be positive or negative, a sufficient condition
for all characteristic roots to lie inside the unit circle is

$$\sum_{i=1}^{n} |a_i| < 1$$

3. At least one characteristic root equals unity if

$$\sum_{i=1}^{n} a_i = 1$$

Any sequence that contains one or more characteristic roots that equal
unity is called a unit root process.

4. For a third-order equation, the stability conditions can be written as

$$1 - a_1 - a_2 - a_3 > 0$$
$$1 + a_1 - a_2 + a_3 > 0$$
$$1 + a_1a_3 + a_2 - a_3^2 > 0$$
$$3 + a_1 + a_2 - 3a_3 > 0 \quad \text{or} \quad 3 - a_1 + a_2 + 3a_3 > 0$$

Given that the first three inequalities are satisfied, either of the last
two can be checked. One of the last conditions is redundant, given that the
other three hold.
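
Rules 1 through 3 translate directly into short checks on the coefficient list. The sketch below is illustrative only: the function names and the tolerance are assumptions, and the first two coefficient sets are borrowed from the second-order worksheet examples.

```python
def necessary_condition_holds(coeffs):
    """Rule 1: stability requires the sum of the a_i to be below one."""
    return sum(coeffs) < 1

def sufficient_condition_holds(coeffs):
    """Rule 2: a sum of |a_i| below one guarantees stability."""
    return sum(abs(a) for a in coeffs) < 1

def has_unit_root(coeffs, tol=1e-12):
    """Rule 3: if the a_i sum to one, alpha = 1 solves the characteristic equation."""
    return abs(sum(coeffs) - 1) < tol

explosive_example = [0.7, 0.35]   # roots 1.037 and -0.337: rule 1 fails
imaginary_example = [1.6, -0.9]   # stable (r = 0.949) even though rule 2 fails
unit_root_example = [0.75, 0.25]  # coefficients sum to exactly one

for coeffs in (explosive_example, imaginary_example, unit_root_example):
    print(coeffs, necessary_condition_holds(coeffs),
          sufficient_condition_holds(coeffs), has_unit_root(coeffs))
```

The middle case shows why rule 2 is only sufficient: the equation is stable although the absolute values of its coefficients sum to 2.5.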

7. PARTICULAR SOLUTIONS FOR DETERMINISTIC PROCESSES

Finding the particular solution to a difference equation is often a matter of ingenuity
and perseverance. The appropriate technique depends heavily on the form of the $\{x_t\}$
process. We begin by considering those processes that contain only deterministic
components. Of course, in econometric analysis, the forcing process will contain both
deterministic and stochastic components.

CASE 1

$x_t = 0$. When all elements of the $\{x_t\}$ process are zero, the difference
equation becomes

$$y_t = a_0 + a_1y_{t-1} + a_2y_{t-2} + \cdots + a_ny_{t-n} \qquad (1.58)$$







Intuition suggests that an unchanging value of y (i.e., $y_t = y_{t-1} = \cdots = c$)
should solve the equation. Substitute the trial solution $y_t = c$ into (1.58) to obtain

$$c = a_0 + a_1c + a_2c + \cdots + a_nc$$

so that

$$c = a_0/(1 - a_1 - a_2 - \cdots - a_n) \qquad (1.59)$$

As long as $(1 - a_1 - a_2 - \cdots - a_n)$ does not equal zero, the value of c given
by (1.59) is a solution to (1.58). Hence, the particular solution to (1.58) is given
by $y_t^p = a_0/(1 - a_1 - a_2 - \cdots - a_n)$.

If $1 - a_1 - a_2 - \cdots - a_n = 0$, the value of c in (1.59) is undefined; it is
necessary to try some other form for the solution. The key insight is that $\{y_t\}$ is a
unit root process if $\sum a_i = 1$. Since $\{y_t\}$ is not convergent, it stands to reason that
the constant solution does not work. Instead, recall equations (1.12) and (1.26);
these solutions suggest that a linear time trend can appear in the solution of a
unit root process. As such, try the solution: $y_t^p = ct$. For ct to be a solution it
must be the case that

$$ct = a_0 + a_1c(t-1) + a_2c(t-2) + \cdots + a_nc(t-n)$$

or, combining like terms,

$$(1 - a_1 - a_2 - \cdots - a_n)ct = a_0 - c(a_1 + 2a_2 + 3a_3 + \cdots + na_n)$$

Since $1 - a_1 - a_2 - \cdots - a_n = 0$, select the value of c such that

$$c = a_0/(a_1 + 2a_2 + 3a_3 + \cdots + na_n)$$

For example, let

$$y_t = 2 + 0.75y_{t-1} + 0.25y_{t-2}$$

Here, $a_1 = 0.75$ and $a_2 = 0.25$; $\{y_t\}$ is a unit root process because $a_1 + a_2 = 1$.
The particular solution has the form ct, where c = 2/[0.75 + 2(0.25)] = 1.6.
In the event that the solution ct fails, sequentially try the solutions: $y_t^p = ct^2$,
$ct^3$, ..., $ct^n$. For an nth-order equation, one of these solutions will always be the
particular solution.
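
The choice between the constant solution and the trend solution can be written as a small routine. This is a sketch under stated assumptions: the function name, the return convention, and the tolerance are hypothetical, and only the first fallback (ct) is implemented.

```python
def deterministic_particular_solution(a0, coeffs, tol=1e-12):
    """Return ("constant", c) or ("trend", c) for y_t = a0 + sum_i a_i * y_{t-i}."""
    gap = 1 - sum(coeffs)
    if abs(gap) > tol:
        return "constant", a0 / gap                    # equation (1.59)
    weighted = sum(i * a for i, a in enumerate(coeffs, start=1))
    if abs(weighted) <= tol:
        raise ValueError("c*t also fails; try c*t**2, c*t**3, ... as in the text")
    return "trend", a0 / weighted                      # particular solution c*t

kind, c = deterministic_particular_solution(2.0, [0.75, 0.25])
print(kind, c)   # the unit root example from the text: a trend with c = 1.6

# Confirm that y_t = c*t satisfies y_t = 2 + 0.75*y_{t-1} + 0.25*y_{t-2}.
gaps = [c * t - (2 + 0.75 * c * (t - 1) + 0.25 * c * (t - 2)) for t in range(2, 12)]
print(max(abs(g) for g in gaps))
```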

CASE 2

The Exponential Case. Let $x_t$ have the exponential form $b(d)^{rt}$, where b, d, and
r are constants. Since r has the natural interpretation as a growth rate, we would
expect to encounter this type of forcing process case in a growth context. We
illustrate the solution procedure using the first-order equation

$$y_t = a_0 + a_1y_{t-1} + bd^{rt} \qquad (1.60)$$

To try to gain an intuitive feel for the form of the solution, notice that if
b = 0, (1.60) is a special case of (1.58). Hence, you should expect a constant to
appear in the particular solution. Moreover, the expression $d^{rt}$ grows at the
constant rate r. Thus, you might expect the particular solution to have the form
$y_t^p = c_0 + c_1d^{rt}$, where $c_0$ and $c_1$ are constants. If this equation is actually a
solution, you should be able to substitute it back into (1.60) and obtain an
identity. Making the appropriate substitutions, we get

$$c_0 + c_1d^{rt} = a_0 + a_1[c_0 + c_1d^{r(t-1)}] + bd^{rt} \qquad (1.61)$$

For this solution to work, it is necessary to select $c_0$ and $c_1$ such that

$$c_0 = a_0/(1 - a_1) \quad \text{and} \quad c_1 = [bd^r]/(d^r - a_1)$$

Thus, a particular solution is

$$y_t^p = \frac{a_0}{1 - a_1} + \frac{bd^r}{d^r - a_1}\,d^{rt}$$

The nature of the solution is that $y_t^p$ equals the constant $a_0/(1 - a_1)$ plus an
expression that grows at the rate r. Note that for $|d^r| < 1$, the particular solution
converges to $a_0/(1 - a_1)$.

If either $a_1 = 1$ or $a_1 = d^r$, use the trick suggested in Case 1. If $a_1 = 1$, try
the solution $c_0 = ct$, and if $a_1 = d^r$, try the solution $c_1 = tb$. Use precisely the
same methodology in higher-order systems.
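
A quick numerical check of the exponential case: the sketch below computes $c_0$ and $c_1$ from the formulas above and confirms that $c_0 + c_1d^{rt}$ satisfies (1.60). The function name and all numerical values are assumptions chosen for illustration.

```python
def exponential_particular_solution(a0, a1, b, d, r):
    """Coefficients (c0, c1) of y_t = c0 + c1 * d**(r*t) for equation (1.60)."""
    growth = d ** r
    if a1 == 1 or a1 == growth:
        raise ValueError("special case: multiply the trial term by t, as in the text")
    return a0 / (1 - a1), b * growth / (growth - a1)

# Hypothetical values, chosen only to exercise the formulas.
a0, a1, b, d, r = 1.0, 0.5, 2.0, 1.1, 1.0
c0, c1 = exponential_particular_solution(a0, a1, b, d, r)

def y(t):
    """Candidate particular solution evaluated at period t."""
    return c0 + c1 * d ** (r * t)

# Substitute back into (1.60); each gap should be zero up to rounding error.
gaps = [y(t) - (a0 + a1 * y(t - 1) + b * d ** (r * t)) for t in range(1, 10)]
print(c0, c1, max(abs(g) for g in gaps))
```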

CASE 3

Deterministic Time Trend. In this case, let the $\{x_t\}$ sequence be represented by
the relationship $x_t = bt^d$ where b is a constant and d is a positive integer. Hence

$$y_t = a_0 + \sum_{i=1}^{n} a_iy_{t-i} + bt^d \qquad (1.62)$$

Since $y_t$ depends on $t^d$, it follows that $y_{t-1}$ depends on $(t-1)^d$, $y_{t-2}$ depends
on $(t-2)^d$, and so on. As such, the particular solution has the form
$y_t^p = c_0 + c_1t + c_2t^2 + \cdots + c_dt^d$. To find the values of the $c_i$, substitute the
particular solution into (1.62). Then select the value of each $c_i$ that results in an
identity. Although various values of d are possible, in economic applications it is
common to see models incorporating a linear time trend (d = 1). For illustrative
purposes, consider the second-order equation $y_t = a_0 + a_1y_{t-1} + a_2y_{t-2} + bt$. Posit
the solution $y_t^p = c_0 + c_1t$ where $c_0$ and $c_1$ are undetermined coefficients.
Substituting this "challenge solution" into the second-order difference equation
yields

$$c_0 + c_1t = a_0 + a_1[c_0 + c_1(t-1)] + a_2[c_0 + c_1(t-2)] + bt \qquad (1.63)$$

Now select values of $c_0$ and $c_1$ so as to force equation (1.63) to be an identity
for all possible values of t. If we combine all constant terms and all terms
involving t, the required values of $c_0$ and $c_1$ are

$$c_1 = b/(1 - a_1 - a_2)$$
$$c_0 = [a_0 - (2a_2 + a_1)c_1]/(1 - a_1 - a_2)$$

so that

$$c_0 = [a_0/(1 - a_1 - a_2)] - [b/(1 - a_1 - a_2)^2](2a_2 + a_1)$$







Thus, the particular solution will also contain a linear time trend. You
should have no difficulty foreseeing the solution technique if $a_1 + a_2 = 1$. In this
circumstance (which is applicable to higher-order cases as well), try multiplying
the original challenge solution by t.
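
The coefficients of the linear-trend solution can be verified in the same way. In this sketch the function name and the parameter values are hypothetical; the formulas for $c_0$ and $c_1$ are the ones obtained from (1.63).

```python
def trend_particular_solution(a0, a1, a2, b):
    """Coefficients (c0, c1) of y_t = c0 + c1*t for y_t = a0 + a1*y_{t-1} + a2*y_{t-2} + b*t."""
    gap = 1 - a1 - a2
    if gap == 0:
        raise ValueError("a1 + a2 = 1: multiply the challenge solution by t instead")
    c1 = b / gap
    c0 = (a0 - (2 * a2 + a1) * c1) / gap
    return c0, c1

# Hypothetical coefficients for illustration.
a0, a1, a2, b = 1.0, 0.5, 0.2, 0.3
c0, c1 = trend_particular_solution(a0, a1, a2, b)

def y(t):
    """Candidate particular solution evaluated at period t."""
    return c0 + c1 * t

# Equation (1.63) should hold as an identity in t.
gaps = [y(t) - (a0 + a1 * y(t - 1) + a2 * y(t - 2) + b * t) for t in range(2, 20)]
print(c0, c1, max(abs(g) for g in gaps))
```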

8. THE METHOD OF UNDETERMINED COEFFICIENTS

At this point, it is appropriate to introduce the first of two useful methods for finding
particular solutions when there are stochastic components in the $\{y_t\}$ process. The
key insight of the method of undetermined coefficients is that the particular
solution to a linear difference equation is necessarily linear. Moreover, the solution
can depend only on time, a constant, and the elements of the forcing process $\{x_t\}$. Thus,
it is often possible to know the exact form of the solution even though the coefficients
of the solution are unknown. The technique involves positing a solution, called a
challenge solution, that is a linear function of all terms thought to appear in the
actual solution. The problem becomes one of finding the set of values for those
undetermined coefficients that solve the difference equation.

The actual technique for finding the coefficients is straightforward. Substitute the
challenge solution into the original difference equation and solve for the values of the
undetermined coefficients that yield an identity for all possible values of the included
variables. If it is not possible to obtain an identity, the form of the challenge solution
is incorrect. Try a new trial solution and repeat the process. In fact, we used the method
of undetermined coefficients when positing the challenge solutions $y_t^p = c_0 + c_1d^{rt}$ and
$y_t^p = c_0 + c_1t$ for Cases 2 and 3 in Section 7.

To begin, reconsider the simple first-order equation $y_t = a_0 + a_1y_{t-1} + \varepsilon_t$. Since
you have solved this equation using the iterative method, the equation is useful for
illustrating the method of undetermined coefficients. The nature of the $\{y_t\}$ process is
such that the particular solution can depend only on a constant term, time, and the
individual elements of the $\{\varepsilon_t\}$ sequence. Since t does not explicitly appear in the
forcing process, t can be in the particular solution only if the characteristic root is
unity. Since the goal is to illustrate the method, posit the challenge solution:

$$y_t = b_0 + b_1t + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i} \qquad (1.64)$$

where $b_0$, $b_1$, and all the $\alpha_i$ are the coefficients to be determined.

Substitute (1.64) into the original difference equation to form

$$b_0 + b_1t + \alpha_0\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \cdots = a_0 + a_1[b_0 + b_1(t-1) + \alpha_0\varepsilon_{t-1} + \alpha_1\varepsilon_{t-2} + \cdots] + \varepsilon_t$$

Collecting like terms, we obtain

$$(b_0 - a_0 - a_1b_0 + a_1b_1) + b_1(1 - a_1)t + (\alpha_0 - 1)\varepsilon_t + (\alpha_1 - a_1\alpha_0)\varepsilon_{t-1} + (\alpha_2 - a_1\alpha_1)\varepsilon_{t-2} + (\alpha_3 - a_1\alpha_2)\varepsilon_{t-3} + \cdots = 0 \qquad (1.65)$$










Equation (1.65) must hold for all values of t and all possible values of the $\{\varepsilon_t\}$
sequence. Thus, each of the following conditions must hold:

$$\alpha_0 - 1 = 0$$
$$\alpha_1 - a_1\alpha_0 = 0$$
$$\alpha_2 - a_1\alpha_1 = 0$$
$$\vdots$$
$$b_0 - a_0 - a_1b_0 + a_1b_1 = 0$$
$$b_1 - a_1b_1 = 0$$

Notice that the first set of conditions can be solved for the $\alpha_i$ recursively. The
solution of the first condition entails setting $\alpha_0 = 1$. Given this solution for $\alpha_0$, the
next equation requires $\alpha_1 = a_1$. Moving down the list, $\alpha_2 = a_1\alpha_1$ or $\alpha_2 = a_1^2$.
Continuing the recursive process, we find $\alpha_i = a_1^i$. Now consider the last two
equations. There are two possible cases depending on the value of $a_1$. If $a_1 \neq 1$, it
immediately follows that $b_1 = 0$ and $b_0 = a_0/(1 - a_1)$. For this case, the particular
solution is

$$y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i\varepsilon_{t-i}$$

Compare this result to (1.21); you will see that it is precisely the same solution
found using the iterative method. The general solution is the sum of this particular
solution plus the homogeneous solution $Aa_1^t$. Hence, the general solution is

$$y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i\varepsilon_{t-i} + Aa_1^t$$

Now, if there is an initial condition for $y_0$, it follows that

$$y_0 = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i\varepsilon_{-i} + A$$

Combining these two equations so as to eliminate the arbitrary constant A, we obtain

$$y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i\varepsilon_{t-i} + a_1^t\left[y_0 - a_0/(1 - a_1) - \sum_{i=0}^{\infty} a_1^i\varepsilon_{-i}\right]$$

so that

$$y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{t-1} a_1^i\varepsilon_{t-i} + a_1^t\left[y_0 - a_0/(1 - a_1)\right] \qquad (1.66)$$

It can be easily verified that (1.66) is identical to (1.25). Instead, if $a_1 = 1$, $b_0$ can
be any arbitrary constant and $b_1 = a_0$. The improper form of the solution is

$$y_t = b_0 + a_0t + \sum_{i=0}^{\infty}\varepsilon_{t-i}$$






The form of the solution is "improper" because the sum of the $\{\varepsilon_t\}$ sequence may
not be finite. Therefore, it is necessary to impose an initial condition. If the value
$y_0$ is given, it follows that

$$y_0 = b_0 + \sum_{i=0}^{\infty}\varepsilon_{-i}$$

Imposing the initial condition on the improper form of the solution yields (1.26)

$$y_t = y_0 + a_0t + \sum_{i=1}^{t}\varepsilon_i$$
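
Both closed forms can be checked against brute-force iteration of the difference equation. The sketch below is illustrative: the function names, parameter values, and the seeded random shocks are all assumptions, and the shock list is indexed so that shocks[t-1] holds $\varepsilon_t$.

```python
import random

def iterate_ar1(a0, a1, y0, shocks):
    """Iterate y_t = a0 + a1*y_{t-1} + eps_t; shocks[t-1] holds eps_t for t = 1..T."""
    path = [y0]
    for eps in shocks:
        path.append(a0 + a1 * path[-1] + eps)
    return path

def closed_form_ar1(a0, a1, y0, shocks, t):
    """Equation (1.66) when a1 differs from one, equation (1.26) when a1 equals one."""
    eps = lambda s: shocks[s - 1]                 # eps_s for s = 1..T
    if a1 == 1:
        return y0 + a0 * t + sum(eps(s) for s in range(1, t + 1))
    mean = a0 / (1 - a1)
    return mean + sum(a1 ** i * eps(t - i) for i in range(t)) + a1 ** t * (y0 - mean)

random.seed(0)                                    # arbitrary seed, for repeatability only
shocks = [random.gauss(0.0, 1.0) for _ in range(25)]

for a1 in (0.8, 1):
    path = iterate_ar1(0.5, a1, 2.0, shocks)
    worst = max(abs(path[t] - closed_form_ar1(0.5, a1, 2.0, shocks, t)) for t in range(26))
    print(a1, worst)   # differences are rounding error only
```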

To take a second example, consider the equation

$$y_t = a_0 + a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1} \qquad (1.67)$$

Again, the solution can depend only on a constant, the elements of the $\{\varepsilon_t\}$
sequence, and t raised to the first power. As in the previous example, t does not need
to be included in the challenge solution if the characteristic root differs from unity. To
reinforce this point, use the challenge solution given by (1.64). Substitute this
tentative solution into (1.67) to obtain

$$b_0 + b_1t + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i} = a_0 + a_1\left[b_0 + b_1(t-1) + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-1-i}\right] + \varepsilon_t + \beta_1\varepsilon_{t-1}$$

Matching coefficients on all terms containing $\varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, yields

$\alpha_0 = 1$
$\alpha_1 = a_1\alpha_0 + \beta_1$   [so that $\alpha_1 = a_1 + \beta_1$]
$\alpha_2 = a_1\alpha_1$   [so that $\alpha_2 = a_1(a_1 + \beta_1)$]
$\alpha_3 = a_1\alpha_2$   [so that $\alpha_3 = (a_1)^2(a_1 + \beta_1)$]
$\alpha_i = a_1\alpha_{i-1}$   [so that $\alpha_i = (a_1)^{i-1}(a_1 + \beta_1)$]

Matching coefficients of intercept terms and coefficients of terms containing t,
we get

$$b_0 = a_0 + a_1b_0 - a_1b_1$$
$$b_1 = a_1b_1$$

Again, there are two cases. If $a_1 \neq 1$, then $b_1 = 0$ and $b_0 = a_0/(1 - a_1)$. The
particular solution is

$$y_t = \frac{a_0}{1 - a_1} + \varepsilon_t + (a_1 + \beta_1)\sum_{i=1}^{\infty} a_1^{i-1}\varepsilon_{t-i}$$

The general solution augments the particular solution with the term $Aa_1^t$. You are
left with the exercise of imposing the initial condition for $y_0$ on the general solution.
Now consider the case in which $a_1 = 1$. The undetermined coefficients are such that
$b_1 = a_0$ and $b_0$ is an arbitrary constant. The improper form of the solution is

$$y_t = b_0 + a_0t + \varepsilon_t + (1 + \beta_1)\sum_{i=1}^{\infty}\varepsilon_{t-i}$$











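If $y_0$ is given, it follows that

$$y_0 = b_0 + \varepsilon_0 + (1 + \beta_1)\sum_{i=1}^{\infty}\varepsilon_{-i}$$

Hence, imposing the initial condition, we obtain

$$y_t = y_0 + a_0t + \varepsilon_t + \beta_1\varepsilon_0 + (1 + \beta_1)\sum_{i=1}^{t-1}\varepsilon_i$$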

Higher-Order Systems 

The identical procedure is used for higher-order systems. As an example, let us find
the particular solution to the second-order equation

$$y_t = a_0 + a_1y_{t-1} + a_2y_{t-2} + \varepsilon_t \qquad (1.68)$$

Since we have a second-order equation, we use the challenge solution

$$y_t = b_0 + b_1t + b_2t^2 + \alpha_0\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \cdots$$

where $b_0$, $b_1$, $b_2$, and the $\alpha_i$ are the undetermined coefficients.

Substituting the challenge solution into (1.68) yields

$$[b_0 + b_1t + b_2t^2] + \alpha_0\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \cdots = a_0 + a_1[b_0 + b_1(t-1) + b_2(t-1)^2 + \alpha_0\varepsilon_{t-1} + \alpha_1\varepsilon_{t-2} + \alpha_2\varepsilon_{t-3} + \cdots] + a_2[b_0 + b_1(t-2) + b_2(t-2)^2 + \alpha_0\varepsilon_{t-2} + \alpha_1\varepsilon_{t-3} + \alpha_2\varepsilon_{t-4} + \cdots] + \varepsilon_t$$

There are several necessary and sufficient conditions for the values of the $\alpha_i$ to
render the equation above an identity for all possible realizations of the $\{\varepsilon_t\}$ sequence:

$$\alpha_0 = 1$$

$$\alpha_1 = a_1\alpha_0 \quad [\text{so that } \alpha_1 = a_1]$$

$$\alpha_2 = a_1\alpha_1 + a_2\alpha_0 \quad [\text{so that } \alpha_2 = (a_1)^2 + a_2]$$

$$\alpha_3 = a_1\alpha_2 + a_2\alpha_1 \quad [\text{so that } \alpha_3 = (a_1)^3 + 2a_1a_2]$$
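These conditions define a recursion that is easy to run by machine. The sketch below is an illustration under assumed values of $a_1$ and $a_2$ (the function name `ar2_alphas` is hypothetical): it starts from $\alpha_0 = 1$ and $\alpha_1 = a_1$, applies $\alpha_j = a_1\alpha_{j-1} + a_2\alpha_{j-2}$, and confirms the bracketed expressions for $\alpha_2$ and $\alpha_3$.

```python
def ar2_alphas(a1, a2, n):
    """alpha_0 = 1, alpha_1 = a1, alpha_j = a1*alpha_{j-1} + a2*alpha_{j-2}."""
    alphas = [1.0, a1]
    for _ in range(2, n):
        alphas.append(a1 * alphas[-1] + a2 * alphas[-2])
    return alphas[:n]


a1, a2 = 0.7, 0.2                 # assumed illustrative values
alphas = ar2_alphas(a1, a2, 4)

print(alphas[2], a1 ** 2 + a2)            # alpha_2 versus (a1)^2 + a2
print(alphas[3], a1 ** 3 + 2 * a1 * a2)   # alpha_3 versus (a1)^3 + 2*a1*a2
```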



Notice that for any value of $j \geq 2$, the coefficients solve the second-order difference
equation $\alpha_j = a_1\alpha_{j-1} + a_2\alpha_{j-2}$. Since we know $\alpha_0$ and $\alpha_1$, we can solve for all
the $\alpha_j$ iteratively. The properties of the coefficients will be precisely those discussed
when considering homogeneous solutions:

1. Convergence necessitates that $|a_2| < 1$, $a_1 + a_2 < 1$, and that $a_2 - a_1 < 1$.
Notice that convergence implies that past values of the $\{\varepsilon_t\}$ sequence ultimately
have a successively smaller influence on the current value of $y_t$.

2. If the coefficients converge, convergence will be direct or oscillatory if
$(a_1^2 + 4a_2) > 0$, will follow a sine/cosine pattern if $(a_1^2 + 4a_2) < 0$, and
will "explode" and then converge if $(a_1^2 + 4a_2) = 0$.

Appropriately setting the $\alpha_i$, we are left with the remaining expression:

$$b_2(1 - a_1 - a_2)t^2 + [b_1(1 - a_1 - a_2) + 2b_2(a_1 + 2a_2)]t + [b_0(1 - a_1 - a_2) - a_0 + a_1(b_1 - b_2) + 2a_2(b_1 - 2b_2)] = 0 \tag{1.69}$$






Equation (1.69) must equal zero for all values of $t$. First, consider the case in which
$a_1 + a_2 \neq 1$. Since $(1 - a_1 - a_2)$ does not vanish, it is necessary to set the value of $b_2$
equal to zero. Given that $b_2 = 0$ and that the coefficient of $t$ must equal zero, it follows
that $b_1$ must also be set equal to zero. Finally, given that $b_1 = b_2 = 0$, we must set
$b_0 = a_0/(1 - a_1 - a_2)$. Instead, if $a_1 + a_2 = 1$, the solutions for the $b_i$ depend on the
specific values of $a_0$, $a_1$, and $a_2$. The key point is that the stability condition for the
homogeneous equation is precisely the condition for convergence of the particular solution. If
any characteristic root of the homogeneous equation is equal to unity, a polynomial time
trend will appear in the particular solution. The order of the polynomial is the number
of unitary characteristic roots. This result generalizes to higher-order equations.
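The link between the three convergence inequalities and the characteristic roots can be illustrated numerically. The following sketch is not from the text; the function names and the four coefficient pairs are assumptions chosen to cover a stable real case, an unstable case, a stable complex case, and a unit root case. It compares the inequalities with a direct check that both roots of $\alpha^2 - a_1\alpha - a_2 = 0$ lie inside the unit circle.

```python
import cmath


def ar2_roots(a1, a2):
    """Roots of alpha^2 - a1*alpha - a2 = 0 (complex arithmetic handles d < 0)."""
    disc = cmath.sqrt(a1 * a1 + 4 * a2)
    return (a1 + disc) / 2, (a1 - disc) / 2


def satisfies_conditions(a1, a2):
    """The three inequalities stated for convergence of the alpha coefficients."""
    return abs(a2) < 1 and a1 + a2 < 1 and a2 - a1 < 1


def pattern(a1, a2):
    """Classify by the sign of a1^2 + 4*a2 (exact equality is fragile in floats)."""
    d = a1 * a1 + 4 * a2
    if d > 0:
        return "real and distinct roots"
    if d < 0:
        return "complex roots (sine/cosine pattern)"
    return "repeated root"


cases = [(0.9, -0.2), (0.7, 0.6), (1.2, -0.4), (1.5, -0.5)]   # assumed examples
results = []
for a1, a2 in cases:
    inside = all(abs(root) < 1 for root in ar2_roots(a1, a2))
    results.append((a1, a2, pattern(a1, a2), satisfies_conditions(a1, a2), inside))
    print(results[-1])
```

In each of these cases the last two entries of the tuple agree. The pair (1.5, -0.5) has a root exactly equal to one, so it fails both checks, which is the situation in which a time trend enters the particular solution.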

If you are really clever, you can combine the discussion of the last section with
the method of undetermined coefficients. Find the deterministic portion of the
particular solution using the techniques discussed in the last section. Then use the method
of undetermined coefficients to find the stochastic portion of the particular solution.
In (1.67), for example, set $\varepsilon_t = \varepsilon_{t-1} = 0$ and obtain the solution $a_0/(1 - a_1)$. Now use
the method of undetermined coefficients to find the particular solution of
$y_t = a_1 y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1}$. Add the deterministic and stochastic components to obtain all
components of the particular solution.

A Solved Problem

To illustrate the methodology using a second-order equation, augment (1.28) with the
stochastic term $\varepsilon_t$ so that

$$y_t = 3 + 0.9y_{t-1} - 0.2y_{t-2} + \varepsilon_t \tag{1.70}$$

You have already verified that the two homogeneous solutions are $A_1(0.5)^t$ and
$A_2(0.4)^t$ and that the deterministic portion of the particular solution is $y_t^p = 10$. To
find the stochastic portion of the particular solution, form the challenge solution

$$y_t = \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i}$$

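In contrast to (1.64), the intercept term $b_0$ is excluded (since we have already
found the deterministic portion of the particular solution) and the time trend $b_1 t$ is
excluded (since both characteristic roots are less than unity). For this challenge to
work, it must satisfy

$$\alpha_0\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \alpha_3\varepsilon_{t-3} + \cdots = 0.9[\alpha_0\varepsilon_{t-1} + \alpha_1\varepsilon_{t-2} + \alpha_2\varepsilon_{t-3} + \alpha_3\varepsilon_{t-4} + \cdots] - 0.2[\alpha_0\varepsilon_{t-2} + \alpha_1\varepsilon_{t-3} + \alpha_2\varepsilon_{t-4} + \alpha_3\varepsilon_{t-5} + \cdots] + \varepsilon_t \tag{1.71}$$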

Since (1.71) must hold for all possible realizations of $\varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, each of the
following conditions must hold:

$$\alpha_0 = 1$$

$$\alpha_1 = 0.9\alpha_0$$

so that $\alpha_1 = 0.9$, and for all $i \geq 2$,

$$\alpha_i = 0.9\alpha_{i-1} - 0.2\alpha_{i-2} \tag{1.72}$$







Now, it is possible to solve (1.72) iteratively so that $\alpha_2 = 0.9\alpha_1 - 0.2\alpha_0 = 0.61$,
$\alpha_3 = 0.9(0.61) - 0.2(0.9) = 0.369$, and so forth. A more elegant solution method is to
view (1.72) as a second-order difference equation in the $\{\alpha_i\}$ sequence with initial
conditions $\alpha_0 = 1$ and $\alpha_1 = 0.9$. The solution to (1.72) is

$$\alpha_i = 5(0.5)^i - 4(0.4)^i \tag{1.73}$$

To obtain (1.73), note that the solution to (1.72) is $\alpha_i = A_3(0.5)^i + A_4(0.4)^i$, where
$A_3$ and $A_4$ are arbitrary constants. Imposing the conditions $\alpha_0 = 1$ and $\alpha_1 = 0.9$ yields
(1.73). If we use (1.73), it follows that $\alpha_0 = 5(0.5)^0 - 4(0.4)^0 = 1$; $\alpha_1 = 5(0.5)^1 -
4(0.4)^1 = 0.9$; $\alpha_2 = 5(0.5)^2 - 4(0.4)^2 = 0.61$; and so on.
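Both routes to the $\alpha_i$ can be compared directly. The sketch below (function names are illustrative) iterates (1.72) from $\alpha_0 = 1$ and $\alpha_1 = 0.9$ and evaluates the closed form (1.73) for the same indices.

```python
def alpha_iterative(n):
    """Iterate alpha_i = 0.9*alpha_{i-1} - 0.2*alpha_{i-2} from alpha_0 = 1, alpha_1 = 0.9."""
    alphas = [1.0, 0.9]
    for _ in range(2, n):
        alphas.append(0.9 * alphas[-1] - 0.2 * alphas[-2])
    return alphas[:n]


def alpha_closed(i):
    """Closed form (1.73): 5*(0.5)^i - 4*(0.4)^i."""
    return 5 * 0.5 ** i - 4 * 0.4 ** i


alphas = alpha_iterative(8)
for i, value in enumerate(alphas):
    print(i, round(value, 6), round(alpha_closed(i), 6))
```

The first four rows reproduce the values 1, 0.9, 0.61, and 0.369 computed in the text.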

The general solution to (1.70) is the sum of the two homogeneous solutions and
the deterministic and stochastic portions of the particular solution:

$$y_t = 10 + A_1(0.5)^t + A_2(0.4)^t + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{t-i} \tag{1.74}$$

where the $\alpha_i$ are given by (1.73).

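Given initial conditions for $y_0$ and $y_1$, it follows that $A_1$ and $A_2$ must satisfy

$$y_0 = 10 + A_1 + A_2 + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{-i} \tag{1.75}$$

$$y_1 = 10 + A_1(0.5) + A_2(0.4) + \sum_{i=0}^{\infty}\alpha_i\varepsilon_{1-i} \tag{1.76}$$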



Although the algebra gets messy, (1.75) and (1.76) can be substituted into (1.74)
to eliminate the arbitrary constants:

$$y_t = 10 + (0.4)^t[5(y_0 - 10) - 10(y_1 - 10)] + (0.5)^t[10(y_1 - 10) - 4(y_0 - 10)] + \sum_{i=0}^{t-2}\alpha_i\varepsilon_{t-i}$$
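The final expression can be verified by brute force. The sketch below is an illustration with assumed initial values and simulated shocks (none of these numbers come from the text): it iterates (1.70) forward from $y_0$ and $y_1$ and compares each value with the closed form, using the $\alpha_i$ from (1.73) and a stochastic sum that runs over $i = 0, \ldots, t-2$.

```python
import random


def alpha(i):
    """Coefficients from (1.73)."""
    return 5 * 0.5 ** i - 4 * 0.4 ** i


def iterate(y0, y1, eps, T):
    """Iterate y_t = 3 + 0.9*y_{t-1} - 0.2*y_{t-2} + eps_t for t = 2..T."""
    y = [y0, y1]
    for t in range(2, T + 1):
        y.append(3 + 0.9 * y[t - 1] - 0.2 * y[t - 2] + eps[t])
    return y


def closed_form(y0, y1, eps, t):
    """Solution with the arbitrary constants eliminated by y0 and y1."""
    homogeneous = (0.4 ** t) * (5 * (y0 - 10) - 10 * (y1 - 10)) \
        + (0.5 ** t) * (10 * (y1 - 10) - 4 * (y0 - 10))
    stochastic = sum(alpha(i) * eps[t - i] for i in range(0, t - 1))  # i = 0 .. t-2
    return 10 + homogeneous + stochastic


random.seed(1)
T = 15
eps = [random.gauss(0.0, 1.0) for _ in range(T + 1)]   # eps[0], eps[1] are unused
y0, y1 = 8.0, 12.0                                     # assumed initial conditions
path = iterate(y0, y1, eps, T)
max_gap = max(abs(path[t] - closed_form(y0, y1, eps, t)) for t in range(T + 1))
print(max_gap)
```

Because shocks dated 1 and earlier are already embodied in $y_0$ and $y_1$, only $\varepsilon_2$ through $\varepsilon_t$ appear in the sum.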






9. LAG OPERATORS 

If it is not important to know the actual values of the coefficients appearing in the
particular solution, it is often more convenient to use lag operators rather than the
method of undetermined coefficients. The lag operator $L$ is defined to be a linear
operator such that for any value $y_t$,

$$L^i y_t = y_{t-i} \tag{1.77}$$

Thus, $L^i$ preceding $y_t$ simply means to lag $y_t$ by $i$ periods. It is useful to
remember the following properties of lag operators:

1. The lag of a constant is a constant: $Lc = c$.

2. The distributive law holds for lag operators. We can set
$(L^i + L^j)y_t = L^i y_t + L^j y_t = y_{t-i} + y_{t-j}$.






3. The associative law of multiplication holds for lag operators. We can set
$L^i L^j y_t = L^i(L^j y_t) = L^i y_{t-j} = y_{t-i-j}$. Similarly, we can set
$L^i L^j y_t = L^{i+j} y_t = y_{t-i-j}$. Note that $L^0 y_t = y_t$.

4. $L$ raised to a negative power is actually a lead operator: $L^{-i} y_t = y_{t+i}$. To
explain, define $j = -i$ and form $L^j y_t = y_{t-j} = y_{t+i}$.

5. For $|a| < 1$, the infinite sum $(1 + aL + a^2L^2 + a^3L^3 + \cdots)y_t = y_t/(1 - aL)$.
This property of lag operators may not seem intuitive, but it follows
directly from properties 2 and 3 above.

Proof: Multiply each side by $(1 - aL)$ to form $(1 - aL)(1 + aL + a^2L^2 +
a^3L^3 + \cdots)y_t = y_t$. Multiply the two expressions to obtain $(1 - aL + aL - a^2L^2
+ a^2L^2 - a^3L^3 + \cdots)y_t = y_t$. Given that $|a| < 1$, the expression $a^nL^ny_t$
converges to zero as $n \to \infty$. Thus, the two sides of the equation are equal.

6. For $|a| > 1$, the infinite sum $[1 + (aL)^{-1} + (aL)^{-2} + (aL)^{-3} + \cdots]y_t =
-aLy_t/(1 - aL)$. Thus,

$$\frac{y_t}{1 - aL} = -(aL)^{-1}\sum_{i=0}^{\infty}(aL)^{-i}y_t$$

Proof: Multiply by $(1 - aL)$ to form $(1 - aL)[1 + (aL)^{-1} + (aL)^{-2} +
(aL)^{-3} + \cdots]y_t = -aLy_t$. Perform the indicated multiplication to obtain $[1 -
aL + (aL)^{-1} - 1 + (aL)^{-2} - (aL)^{-1} + (aL)^{-3} - (aL)^{-2} + \cdots]y_t = -aLy_t$. Given
that $|a| > 1$, the expression $a^{-n}L^{-n}y_t$ converges to zero as $n \to \infty$. Thus, the
two sides of the equation are equal.
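Property 5 can be made concrete on a finite data set. The sketch below is illustrative only: the helper names `lag` and `geometric_lag_sum`, the data, and the assumption that all pre-sample values are zero (so the infinite sum truncates exactly) are not from the text. It forms $x_t = (1 + aL + a^2L^2 + \cdots)y_t$ and then applies $(1 - aL)$ to recover $y_t$.

```python
def lag(series, k=1, fill=0.0):
    """Apply L^k to a list: entry t becomes series[t-k], or `fill` before the sample."""
    return [series[t - k] if t - k >= 0 else fill for t in range(len(series))]


def geometric_lag_sum(series, a):
    """(1 + aL + a^2 L^2 + ...) y_t, taking pre-sample values of y to be zero."""
    return [sum(a ** i * series[t - i] for i in range(t + 1))
            for t in range(len(series))]


y = [1.0, 2.0, -1.0, 0.5, 3.0]    # arbitrary assumed data
a = 0.6                           # assumed value with |a| < 1
x = geometric_lag_sum(y, a)

# Applying (1 - aL) to x should return the original series.
recovered = [x_t - a * x_lag for x_t, x_lag in zip(x, lag(x))]
print(recovered)
```

The same helper illustrates property 3: `lag(lag(y, 1), 1)` and `lag(y, 2)` produce the same list.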

Lag operators provide a concise notation for writing difference equations. Using
lag operators, we can write the $p$th-order equation $y_t = a_0 + a_1y_{t-1} + \cdots + a_py_{t-p} + \varepsilon_t$ as

$$(1 - a_1L - a_2L^2 - \cdots - a_pL^p)y_t = a_0 + \varepsilon_t$$

or, more compactly, as

$$A(L)y_t = a_0 + \varepsilon_t$$

where $A(L)$ is the polynomial $(1 - a_1L - a_2L^2 - \cdots - a_pL^p)$.

Since $A(L)$ can be viewed as a polynomial in the lag operator, the notation $A(1)$
is used to denote the sum of the coefficients

$$A(1) = 1 - a_1 - a_2 - \cdots - a_p$$

As a second example, lag operators can be used to express the equation $y_t = a_0 +
a_1y_{t-1} + \cdots + a_py_{t-p} + \varepsilon_t + \beta_1\varepsilon_{t-1} + \cdots + \beta_q\varepsilon_{t-q}$ as

$$A(L)y_t = a_0 + B(L)\varepsilon_t$$

where $A(L)$ and $B(L)$ are polynomials of orders $p$ and $q$, respectively.

It is straightforward to use lag operators to solve linear difference equations.
Again consider the first-order equation $y_t = a_0 + a_1y_{t-1} + \varepsilon_t$ where $|a_1| < 1$. Use the
definition of $L$ to form

$$y_t = a_0 + a_1Ly_t + \varepsilon_t \tag{1.78}$$











Solving for $y_t$, we obtain

$$y_t = \frac{a_0 + \varepsilon_t}{1 - a_1L} \tag{1.79}$$



From property 1, we know that $La_0 = a_0$, so that $a_0/(1 - a_1L) = a_0 + a_1a_0 + a_1^2a_0
+ \cdots = a_0/(1 - a_1)$. From property 5, we know that $\varepsilon_t/(1 - a_1L) = \varepsilon_t + a_1\varepsilon_{t-1} + a_1^2\varepsilon_{t-2}
+ \cdots$. Combining these two parts of the solution, we obtain the particular solution
given by (1.21).

For practice, we can use lag operators to solve (1.67): $y_t = a_0 + a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1}$,
where $|a_1| < 1$. Use property 2 to form $(1 - a_1L)y_t = a_0 + (1 + \beta_1L)\varepsilon_t$. Solving for $y_t$ yields

$$y_t = [a_0 + (1 + \beta_1L)\varepsilon_t]/(1 - a_1L)$$

so that

$$y_t = [a_0/(1 - a_1)] + [\varepsilon_t/(1 - a_1L)] + [\beta_1\varepsilon_{t-1}/(1 - a_1L)] \tag{1.80}$$

Expanding the last two terms of (1.80) yields the same solution found using the 
method of undetermined coefficients. 

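Now suppose $y_t = a_0 + a_1y_{t-1} + \varepsilon_t$ but $|a_1| > 1$. The application of property 5 to
(1.79) is inappropriate because it implies that $y_t$ is infinite. Instead, expand (1.79)
using property 6:

$$y_t = \frac{a_0}{1 - a_1} - (a_1L)^{-1}\sum_{i=0}^{\infty}(a_1L)^{-i}\varepsilon_t \tag{1.81}$$

$$= \frac{a_0}{1 - a_1} - \frac{1}{a_1}\sum_{i=0}^{\infty}(a_1L)^{-i}\varepsilon_{t+1}$$

$$= \frac{a_0}{1 - a_1} - \sum_{i=0}^{\infty}a_1^{-(i+1)}\varepsilon_{t+1+i} \tag{1.82}$$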
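The forward-looking expansion can be checked on a short example. The sketch below rests on stated assumptions: the function name, the values $a_0 = 1$ and $a_1 = 2$, the shock list, and the convention that all shocks beyond the end of the list are zero are illustrative choices, not part of the text. It builds $y_t = a_0/(1 - a_1) - \sum_{i \geq 0} a_1^{-(i+1)}\varepsilon_{t+1+i}$ and confirms that the result satisfies $y_t = a_0 + a_1y_{t-1} + \varepsilon_t$.

```python
def forward_solution(a0, a1, eps, t):
    """a0/(1-a1) - sum_{i>=0} a1^-(i+1) * eps_{t+1+i}; shocks past the list are zero."""
    total = sum(a1 ** -(i + 1) * eps[t + 1 + i] for i in range(len(eps) - t - 1))
    return a0 / (1 - a1) - total


a0, a1 = 1.0, 2.0                          # assumed values with |a1| > 1
eps = [0.3, -0.5, 1.2, 0.0, 0.7, -0.2]     # eps_0 .. eps_5, assumed zero afterwards
y = [forward_solution(a0, a1, eps, t) for t in range(len(eps))]

# Residual of the difference equation y_t - (a0 + a1*y_{t-1} + eps_t), t = 1..5.
residuals = [y[t] - (a0 + a1 * y[t - 1] + eps[t]) for t in range(1, len(eps))]
print(residuals)
```

Each value of $y_t$ depends only on shocks dated $t+1$ and later, and the weights $a_1^{-(i+1)}$ shrink with the horizon because $|a_1| > 1$.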



Lag Operators in Higher-Order Systems 

We can also use lag operators to transform the $n$th-order equation $y_t = a_0 + a_1y_{t-1} +
a_2y_{t-2} + \cdots + a_ny_{t-n} + \varepsilon_t$ into

$$(1 - a_1L - a_2L^2 - \cdots - a_nL^n)y_t = a_0 + \varepsilon_t$$

or

$$y_t = (a_0 + \varepsilon_t)/(1 - a_1L - a_2L^2 - \cdots - a_nL^n)$$

From our previous analysis (also see Appendix 1.2), we know that the stability
condition is such that the characteristic roots of the equation $\alpha^n - a_1\alpha^{n-1} - \cdots - a_n = 0$
all lie within the unit circle. Notice that the values of $\alpha$ solving the characteristic
equation are the reciprocals of the values of $L$ that solve the equation $1 - a_1L - \cdots - a_nL^n =
0$. In fact, the expression $1 - a_1L - \cdots - a_nL^n$ is often called the inverse characteristic
equation. Thus, in the literature, it is often stated that the stability condition is for the
characteristic roots of $(1 - a_1L - \cdots - a_nL^n)$ to lie outside of the unit circle.
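The reciprocal relationship is easy to see in the second-order case. The sketch below is illustrative (function names are hypothetical, and the coefficients 0.9 and -0.2 are borrowed from the solved problem of Section 8): it solves $\alpha^2 - a_1\alpha - a_2 = 0$ and $1 - a_1L - a_2L^2 = 0$ with the quadratic formula and compares the two sets of roots.

```python
import cmath


def characteristic_roots(a1, a2):
    """Roots of alpha^2 - a1*alpha - a2 = 0."""
    s = cmath.sqrt(a1 * a1 + 4 * a2)
    return [(a1 + s) / 2, (a1 - s) / 2]


def inverse_characteristic_roots(a1, a2):
    """Roots of 1 - a1*L - a2*L^2 = 0, written as a2*L^2 + a1*L - 1 = 0 (needs a2 != 0)."""
    s = cmath.sqrt(a1 * a1 + 4 * a2)
    return [(-a1 + s) / (2 * a2), (-a1 - s) / (2 * a2)]


a1, a2 = 0.9, -0.2
alphas = characteristic_roots(a1, a2)           # approximately 0.5 and 0.4
lag_roots = inverse_characteristic_roots(a1, a2)

reciprocals = sorted(abs(1 / root) for root in alphas)
moduli = sorted(abs(root) for root in lag_roots)
print(reciprocals, moduli)
```

Both characteristic roots lie inside the unit circle, so both roots of the inverse characteristic equation lie outside it.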






In principle, one could use lag operators to actually obtain the coefficients of the
particular solution. To illustrate using the second-order case, consider $y_t = (a_0 + \varepsilon_t)/(1
- a_1L - a_2L^2)$. If we knew the factors of the quadratic equation were such that $(1 -
a_1L - a_2L^2) = (1 - b_1L)(1 - b_2L)$, we could write

$$y_t = (a_0 + \varepsilon_t)/[(1 - b_1L)(1 - b_2L)]$$

If both $b_1$ and $b_2$ are less than unity in absolute value, we can apply property 5
to obtain

$$y_t = \frac{[a_0/(1 - b_1)] + \sum_{i=0}^{\infty}b_1^i\varepsilon_{t-i}}{1 - b_2L}$$



Reapply the rule to $a_0/(1 - b_1)$ and to each of the elements in the summation $\sum b_1^i
\varepsilon_{t-i}$ to obtain the particular solution. If you want to know the actual coefficients of the
process, it is preferable to use the method of undetermined coefficients. The beauty
of lag operators is that they can be used to denote such particular solutions succinctly.
The general model

$$A(L)y_t = a_0 + B(L)\varepsilon_t$$

has the particular solution

$$y_t = a_0/A(L) + B(L)\varepsilon_t/A(L)$$
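When the numerical coefficients of $B(L)/A(L)$ are wanted, they can be generated by the same coefficient matching used in the method of undetermined coefficients. The sketch below is an illustration, with hypothetical function names: writing $B(L)/A(L) = \psi_0 + \psi_1L + \psi_2L^2 + \cdots$ and matching powers of $L$ in $A(L)\psi(L) = B(L)$ gives $\psi_j = \beta_j + \sum_k a_k\psi_{j-k}$, with $\beta_0 = 1$ and $\beta_j = 0$ for $j > q$.

```python
def psi_weights(a, b, n):
    """First n coefficients of B(L)/A(L).

    a = [a1, ..., ap]        from A(L) = 1 - a1*L - ... - ap*L^p
    b = [beta1, ..., betaq]  from B(L) = 1 + beta1*L + ... + betaq*L^q
    """
    b_full = [1.0] + list(b)
    psi = []
    for j in range(n):
        value = b_full[j] if j < len(b_full) else 0.0
        for k, a_k in enumerate(a, start=1):
            if j - k >= 0:
                value += a_k * psi[j - k]
        psi.append(value)
    return psi


def constant_term(a0, a):
    """a0 / A(1); only defined when A(1) = 1 - sum(a) is nonzero."""
    return a0 / (1 - sum(a))


print(psi_weights([0.9, -0.2], [], 4))    # solved problem of Section 8
print(psi_weights([0.5], [0.3], 3))       # first-order case with one MA term
print(constant_term(3.0, [0.9, -0.2]))    # deterministic part of (1.70)
```

For the solved problem the weights reproduce 1, 0.9, 0.61, and 0.369, and the constant term is 10 up to rounding.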

As suggested by (1.82), there is a forward-looking solution to any linear difference 
equation. This text will not make much use of the forward-looking solution since 
future realizations of stochastic variables are not directly observable. Some of the 
details of forward-looking solutions can be found at www.cba.ua.edu/~enders. 

10. SUMMARY

Time-series econometrics is concerned with the estimation of difference equations containing 
stochastic components. Originally, time-series models were used for forecasting. Uncovering 
the dynamic path of a series improves forecasts because the predictable components of the 
series can be extrapolated into the future. The growing interest in economic dynamics has 
given a new emphasis to time-series econometrics. Stochastic difference equations arise quite 
naturally from dynamic economic models. Appropriately estimated equations can be used for 
the interpretation of economic data and for hypothesis testing. 

This introductory chapter focused on methods of "solving" stochastic difference equations.
Although iteration can be useful, it is impractical in many circumstances. The solution to a linear 
difference equation can be divided into two parts: a particular solution and a homogeneous solu- 
tion. One complicating factor is that the homogeneous solution is not unique. The general solu- 
tion is a linear combination of the particular solution and all homogeneous solutions. Imposing n 
initial conditions on the general solution of an nth-order equation yields a unique solution. 

The homogeneous portion of a difference equation is a measure of the disequilibrium in the 
initial period(s). The homogeneous equation is especially important in that it yields the charac- 
teristic roots; an nth-order equation has n such characteristic roots. If all of the characteristic 
roots lie within the unit circle, the series will be convergent. As you will see in Chapter 2, there
is a direct relationship between the stability conditions and the issue of whether an economic
variable is stationary or nonstationary.

The method of undetermined coefficients and the use of lag operators are powerful tools 
for obtaining the particular solution. The particular solution will be a linear function of the cur- 
rent and past values of the forcing process. In addition, this solution may contain an intercept 
term and a polynomial function of time. Unit roots and characteristic roots outside of the unit 
circle require the imposition of an initial condition for the particular solution to be meaningful.
Some economic models allow for forward-looking solutions; in such circumstances, anticipated
future events have consequences for the present period.

The tools developed in this chapter are aimed at paving the way for the study of time-series
econometrics. It is a good idea to work all of the exercises presented below. Characteristic roots,
the method of undetermined coefficients, and lag operators will be encountered throughout the
remainder of the text. 

QUESTIONS AND EXERCISES

1. Consider the difference equation $y_t = a_0 + a_1y_{t-1}$ with the initial condition $y_0$. Jill solved
the difference equation by iterating backward:

$$y_t = a_0 + a_1y_{t-1}$$

$$= a_0 + a_1(a_0 + a_1y_{t-2})$$

$$= a_0 + a_0a_1 + a_0a_1^2 + \cdots + a_0a_1^{t-1} + a_1^ty_0$$

Bill added the homogeneous and particular solutions to obtain
$y_t = a_0/(1 - a_1) + a_1^t[y_0 - a_0/(1 - a_1)]$.

a. Show that the two solutions are identical for $|a_1| < 1$.

b. Show that for $a_1 = 1$, Jill's solution is equivalent to $y_t = a_0t + y_0$. How would you
use Bill's method to arrive at this same conclusion in the case that $a_1 = 1$?

2. The cobweb model in Section 5 assumed static price expectations. Consider an alternative
formulation called adaptive expectations. Let the expected price in $t$ (denoted by
$p_t^*$) be a weighted average of the price in $t-1$ and the price expectation of the previous
period. Formally,

$$p_t^* = \alpha p_{t-1} + (1 - \alpha)p_{t-1}^*, \qquad 0 < \alpha \leq 1$$

Clearly, when $\alpha = 1$, the static and adaptive expectations schemes are equivalent. An
interesting feature of this model is that it can be viewed as a difference equation expressing
the expected price as a function of its own lagged value and the forcing variable $p_{t-1}$.

a. Find the homogeneous solution for $p_t^*$.

b. Use lag operators to find the particular solution. Check your answer by substituting
your answer into the original difference equation.

3. Suppose that the money supply process has the form $m_t = m + \rho m_{t-1} + \varepsilon_t$, where $m$ is a
constant and $0 < \rho < 1$.

a. Show that it is possible to express $m_{t+n}$ in terms of the known value $m_t$ and the
sequence $\{\varepsilon_{t+1}, \varepsilon_{t+2}, \ldots, \varepsilon_{t+n}\}$.

b. Suppose that all values of $\varepsilon_{t+i}$ for $i > 0$ have a mean value of zero. Explain how you
could use your result in part a to forecast the money supply $n$ periods into the future.

4. Find the particular solutions for each of the following:

a. $y_t = a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1}$

b. $y_t = a_1y_{t-1} + \varepsilon_{1t} + \beta\varepsilon_{2t}$ (Hint: The form of the solution is
$y_t = \sum c_i\varepsilon_{1t-i} + \sum d_i\varepsilon_{2t-i}$.)

5. The unit root problem in time-series econometrics is concerned with characteristic roots
that are equal to unity. In order to preview the issue:

a. Find the homogeneous solution to each of the following (Hint: Each has at least one
unit root):

i. $y_t = 1.5y_{t-1} - 0.5y_{t-2} + \varepsilon_t$
ii. $y_t = y_{t-2} + \varepsilon_t$
iii. $y_t = 2y_{t-1} - y_{t-2} + \varepsilon_t$
iv. $y_t = y_{t-1} + 0.25y_{t-2} - 0.25y_{t-3} + \varepsilon_t$

b. Show that each of the backward-looking solutions is not convergent.

c. Show that Equation i can be written entirely in first differences; that is, $\Delta y_t =
0.5\Delta y_{t-1} + \varepsilon_t$. Find the particular solution for $\Delta y_t$. (Hint: Find the particular solution
for the $\{\Delta y_t\}$ sequence in terms of the $\{\varepsilon_t\}$ sequence.)

d. Similarly transform the other equations into their first-difference form. Find the
backward-looking particular solution, if it exists, for the transformed equations.

e. Given an initial condition $y_0$, find the solution for: $y_t = a_0 - y_{t-1} + \varepsilon_t$.

6. A researcher estimated the following relationship for the inflation rate ($\pi_t$):

$$\pi_t = -0.05 + 0.7\pi_{t-1} + 0.6\pi_{t-2} + \varepsilon_t$$

a. Suppose that in periods 0 and 1, the inflation rate was 10 percent and 11 percent,
respectively. Find the homogeneous, particular, and general solutions for the inflation
rate.

b. Discuss the shape of the impulse response function. Given that the United States is
not headed for runaway inflation, why do you believe that the researcher's equation
is poorly estimated?

7. Consider the stochastic process $y_t = a_0 + a_2y_{t-2} + \varepsilon_t$.

a. Find the homogeneous solution and determine the stability condition.

b. Find the particular solution using the method of undetermined coefficients.

8. Consider the Cagan (1956) demand for money function in which $m_t - p_t = \alpha - \beta(p_{t+1} - p_t)$.

a. Show that the backward-looking particular solution for $p_t$ is divergent.

b. Obtain the forward-looking particular solution for $p_t$ in terms of the $\{m_t\}$ sequence.
In forming the general solution, why is it necessary to assume that the money market
is in long-run equilibrium?

c. Find the impact multiplier. How does an increase in $m_{t+2}$ affect $p_t$? Provide an
intuitive explanation of the shape of the entire impulse response function.

9. For each of the following, verify that the posited solution satisfies the difference
equation. The symbols $c$, $c_0$, and $a_0$ denote constants.

    Equation                               Solution

(a) $y_t - y_{t-1} = 0$                    $y_t = c$

(b) $y_t - y_{t-1} = a_0$                  $y_t = c + a_0t$

(c) $y_t - y_{t-2} = 0$                    $y_t = c + c_0(-1)^t$

(d) $y_t - y_{t-2} = \varepsilon_t$        $y_t = c + c_0(-1)^t + \varepsilon_t + \varepsilon_{t-2} + \varepsilon_{t-4} + \cdots$

10. Part 1: For each of the following, determine whether $\{y_t\}$ represents a stable process.
Determine whether the characteristic roots are real or imaginary and whether the real
parts are positive or negative.

a. $y_t - 1.2y_{t-1} + 0.2y_{t-2}$
b. $y_t - 1.2y_{t-1} + 0.4y_{t-2}$
c. $y_t - 1.2y_{t-1} - 1.2y_{t-2}$
d. $y_t + 1.2y_{t-1}$
e. $y_t - 0.7y_{t-1} - 0.25y_{t-2} + 0.175y_{t-3} = 0$
[Hint: $(x - 0.5)(x + 0.5)(x - 0.7) = x^3 - 0.7x^2 - 0.25x + 0.175$.]

Part 2: Write each of the above equations using lag operators. Determine the characteristic
roots of the inverse characteristic equation.

11. Consider the stochastic difference equation

$$y_t = 0.8y_{t-1} + \varepsilon_t - 0.5\varepsilon_{t-1}$$

a. Suppose that the initial conditions are such that $y_0 = 0$ and $\varepsilon_0 = \varepsilon_{-1} = 0$. Now suppose
that $\varepsilon_1 = 1$. Determine the values $y_1$ through $y_5$ by forward iteration.

b. Find the homogeneous and particular solutions.

c. Impose the initial conditions in order to obtain the general solution.

d. Trace out the time path of an $\varepsilon_t$ shock on the entire time path of the $\{y_t\}$ sequence.

12. Use equation (1.5) to determine the restrictions on $\alpha$ and $\beta$ necessary to ensure that the
$\{y_t\}$ process is stable.

ENDNOTES 

1. Another possibility is to obtain the forward-looking solution; such solutions are discussed in Section 10.

2. Alternatively, you can substitute (1.26) into (1.17). Note that when $\varepsilon_t$ is a pure random disturbance,
$y_t = a_0 + y_{t-1} + \varepsilon_t$ is called a random walk plus drift model.

3. Any linear equation in the variables $x_1$ through $x_n$ is homogeneous if it has the form $a_1x_1 + a_2x_2 +
\cdots + a_nx_n = 0$. To obtain the homogeneous portion of (1.10), simply set the intercept term $a_0$ and
the forcing process $x_t$ equal to zero. Hence, the homogeneous equation for (1.10) is $y_t = a_1y_{t-1} +
a_2y_{t-2} + \cdots + a_ny_{t-n}$.

4. If b > a, the demand and supply curves do not intersect in the positive quadrant. The assumption a
> b guarantees that the equilibrium price is positive.

5. For example, if the forcing process is $x_t = \varepsilon_t + \beta_1\varepsilon_{t-1} + \beta_2\varepsilon_{t-2} + \cdots$, the impact multiplier is the
partial derivative of $y_t$ with respect to $\varepsilon_t$.

APPENDIX 1.1: Imaginary Roots and 
de Moivre's Theorem 

Consider a second-order difference equation $y_t = a_1y_{t-1} + a_2y_{t-2}$ such that the
discriminant $d$ is negative [i.e., $d = a_1^2 + 4a_2 < 0$]. From Section 6, we know that the full
homogeneous solution can be written in the form

$$y_t^h = A_1\alpha_1^t + A_2\alpha_2^t \tag{A1.1}$$

where the two imaginary characteristic roots are

$$\alpha_1 = (a_1 + i\sqrt{-d})/2 \quad \text{and} \quad \alpha_2 = (a_1 - i\sqrt{-d})/2 \tag{A1.2}$$

The purpose of this appendix is to explain how to rewrite and interpret (A1.1) in
terms of standard trigonometric functions. You might first want to refresh your memory
concerning two useful trig identities. For any two angles $\theta_1$ and $\theta_2$,

$$\sin(\theta_1 + \theta_2) = \sin(\theta_1)\cos(\theta_2) + \cos(\theta_1)\sin(\theta_2)$$

$$\cos(\theta_1 + \theta_2) = \cos(\theta_1)\cos(\theta_2) - \sin(\theta_1)\sin(\theta_2) \tag{A1.3}$$

If $\theta_1 = \theta_2$, we can drop subscripts and form

$$\sin(2\theta) = 2\sin(\theta)\cos(\theta)$$

$$\cos(2\theta) = \cos(\theta)\cos(\theta) - \sin(\theta)\sin(\theta) \tag{A1.4}$$

The first task is to demonstrate how to express imaginary numbers in the complex
plane. Consider Figure A1.1 in which the horizontal axis measures real numbers
and the vertical axis measures imaginary numbers. The complex number $a + bi$ can
be represented by the point $a$ units from the origin along the horizontal axis and $b$
units from the origin along the vertical axis. It is convenient to represent the distance
from the origin by the length of the vector denoted by $r$. Consider angle $\theta$ in triangle
$0ab$ and note that $\cos(\theta) = a/r$ and $\sin(\theta) = b/r$. Hence, the lengths $a$ and $b$ can be
measured by

$$a = r\cos(\theta) \quad \text{and} \quad b = r\sin(\theta)$$

In terms of (A1.2), we can define $a = a_1/2$ and $b = \sqrt{-d}/2$. Thus, the characteristic
roots $\alpha_1$ and $\alpha_2$ can be written as:

$$\alpha_1 = a + bi = r[\cos(\theta) + i\sin(\theta)]$$

$$\alpha_2 = a - bi = r[\cos(\theta) - i\sin(\theta)] \tag{A1.5}$$

The next step is to consider the expressions $\alpha_1^t$ and $\alpha_2^t$. Begin with the expression
$\alpha_1^2$ and recall that $i^2 = -1$:

$$\alpha_1^2 = \{r[\cos(\theta) + i\sin(\theta)]\}\{r[\cos(\theta) + i\sin(\theta)]\}$$

$$= r^2[\cos(\theta)\cos(\theta) - \sin(\theta)\sin(\theta) + 2i\sin(\theta)\cos(\theta)]$$

From (A1.4),

$$\alpha_1^2 = r^2[\cos(2\theta) + i\sin(2\theta)]$$

If we continue in this fashion, it is straightforward to demonstrate that

$$\alpha_1^t = r^t[\cos(t\theta) + i\sin(t\theta)] \quad \text{and} \quad \alpha_2^t = r^t[\cos(t\theta) - i\sin(t\theta)] \tag{A1.6}$$
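The polar representation of $\alpha_1^t$ can be confirmed with complex arithmetic. The sketch below is illustrative: the coefficients $a_1 = 1.2$ and $a_2 = -0.4$ are assumed values chosen so that the discriminant is negative, and the variable names are not from the text. It compares direct powers of the complex root with $r^t[\cos(t\theta) + i\sin(t\theta)]$.

```python
import cmath
import math

a1, a2 = 1.2, -0.4                       # assumed values with a1^2 + 4*a2 < 0
d = a1 * a1 + 4 * a2
alpha1 = complex(a1 / 2, math.sqrt(-d) / 2)
r, theta = cmath.polar(alpha1)           # r = |alpha1|, theta = angle in radians

gaps = []
for t in range(10):
    direct = alpha1 ** t
    de_moivre = r ** t * complex(math.cos(t * theta), math.sin(t * theta))
    gaps.append(abs(direct - de_moivre))

print(r, theta, max(gaps))
```

With these definitions $r^2 = a^2 + b^2 = -a_2$, so the modulus of a complex root depends only on $a_2$.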




FIGURE A1.1 A Graphical Representation of Complex Numbers












Since $y_t^h$ is a real number and $\alpha_1$ and $\alpha_2$ are complex, it follows that $A_1$ and $A_2$
must be complex. Although $A_1$ and $A_2$ are arbitrary complex numbers, they must have
the form

$$A_1 = B_1[\cos(B_2) + i\sin(B_2)] \quad \text{and} \quad A_2 = B_1[\cos(B_2) - i\sin(B_2)] \tag{A1.7}$$

where $B_1$ and $B_2$ are arbitrary real numbers measured in radians.

In order to calculate $A_1(\alpha_1^t)$, use (A1.6) and (A1.7) to form

$$A_1\alpha_1^t = B_1[\cos(B_2) + i\sin(B_2)]\,r^t[\cos(t\theta) + i\sin(t\theta)]$$

$$= B_1r^t[\cos(B_2)\cos(t\theta) - \sin(B_2)\sin(t\theta) + i\cos(t\theta)\sin(B_2) + i\sin(t\theta)\cos(B_2)]$$

Using (A1.3), we obtain

$$A_1\alpha_1^t = B_1r^t[\cos(t\theta + B_2) + i\sin(t\theta + B_2)] \tag{A1.8}$$

You should use the same technique to convince yourself that

$$A_2\alpha_2^t = B_1r^t[\cos(t\theta + B_2) - i\sin(t\theta + B_2)] \tag{A1.9}$$

Since the homogeneous solution $y_t^h$ is the sum of (A1.8) and (A1.9),

$$y_t^h = B_1r^t[\cos(t\theta + B_2) + i\sin(t\theta + B_2)] + B_1r^t[\cos(t\theta + B_2) - i\sin(t\theta + B_2)]$$

$$= 2B_1r^t\cos(t\theta + B_2) \tag{A1.10}$$

Since $B_1$ is arbitrary, the homogeneous solution can be written in terms of the
arbitrary constants $B_2$ and $B_3$:

$$y_t^h = B_3r^t\cos(t\theta + B_2) \tag{A1.11}$$

Now imagine a circle with a radius of unity superimposed on Figure A1.1. The
stability condition is for the distance $r = 0b$ to be less than unity. Hence, in the
literature it is said that the stability condition is for the characteristic root(s) to lie within
this unit circle.
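The claim that conjugate constants and conjugate roots combine into a real, damped cosine can be checked numerically. The sketch below is an illustration with assumed values for $B_1$, $B_2$, and the complex root (all names are hypothetical): it evaluates $A_1\alpha_1^t + A_2\alpha_2^t$ with complex arithmetic and compares it with $2B_1r^t\cos(t\theta + B_2)$.

```python
import cmath
import math


def homogeneous_complex(B1, B2, alpha1, t):
    """A1*alpha1^t + A2*alpha2^t, with A1 = B1[cos(B2) + i sin(B2)] and conjugates."""
    A1 = B1 * complex(math.cos(B2), math.sin(B2))
    A2 = A1.conjugate()
    return A1 * alpha1 ** t + A2 * alpha1.conjugate() ** t


def homogeneous_trig(B1, B2, r, theta, t):
    """2*B1*r^t*cos(t*theta + B2)."""
    return 2 * B1 * r ** t * math.cos(t * theta + B2)


alpha1 = complex(0.6, math.sqrt(0.04))   # assumed root with modulus below one
r, theta = cmath.polar(alpha1)
B1, B2 = 1.5, 0.3                        # assumed arbitrary constants

values = [homogeneous_complex(B1, B2, alpha1, t) for t in range(60)]
trig = [homogeneous_trig(B1, B2, r, theta, t) for t in range(60)]
max_imag = max(abs(v.imag) for v in values)
max_gap = max(abs(v.real - s) for v, s in zip(values, trig))
print(max_imag, max_gap, trig[0], trig[59])
```

The imaginary parts cancel, and because $r < 1$ the oscillations die out as $t$ grows.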






APPENDIX 1.2: Characteristic Roots in 
Higher-Order Equations 

The characteristic equation to an $n$th-order difference equation is

$$\alpha^n - a_1\alpha^{n-1} - a_2\alpha^{n-2} - \cdots - a_n = 0 \tag{A1.12}$$

As stated in Section 6, the $n$ values of $\alpha$ which solve this characteristic equation
are called the characteristic roots. Denote the $n$ solutions by $\alpha_1, \alpha_2, \ldots, \alpha_n$. Given
the results in Section 4, the linear combination $A_1\alpha_1^t + A_2\alpha_2^t + \cdots + A_n\alpha_n^t$ is also a
solution to (A1.12).

A priori, the characteristic roots can take on any values. There is no restriction
that they be real versus complex nor any restriction concerning their sign or magnitude.
Consider the possibilities:

1. All the $\alpha_i$ are real and distinct. There are several important subcases.
First suppose that each value of $\alpha_i$ is less than unity in absolute value. In
this case, the homogeneous solution (A1.12) converges since the limit of
each $\alpha_i^t$ equals zero as $t$ approaches infinity. For a negative value of $\alpha_i$,
the expression $\alpha_i^t$ is positive for even values of $t$ and negative for odd values
of $t$. Thus, if any of the $\alpha_i$ are negative (but less than one in absolute
value), the solution will tend to exhibit some oscillation. If any of the $\alpha_i$
are greater than unity in absolute value, the solution will diverge.

2. All of the $\alpha_i$ are real but $m \leq n$ of the roots are repeated. Let the solution
be such that $\alpha_1 = \alpha_2 = \cdots = \alpha_m$. Call the single distinct value of this root $\bar{\alpha}$
and let the other $n - m$ roots be denoted by $\alpha_{m+1}$ through $\alpha_n$. In the case of a
second-order equation with a repeated root, you saw that one solution was
$A_1\alpha^t$ and the other was $A_2t\alpha^t$. With $m$ repeated roots, it is easily verified that
$t\bar{\alpha}^t, t^2\bar{\alpha}^t, \ldots, t^{m-1}\bar{\alpha}^t$ are also solutions to the homogeneous equation. With $m$
repeated roots, the linear combination of all these solutions is

$$A_1\bar{\alpha}^t + A_2t\bar{\alpha}^t + A_3t^2\bar{\alpha}^t + \cdots + A_mt^{m-1}\bar{\alpha}^t + A_{m+1}\alpha_{m+1}^t + \cdots + A_n\alpha_n^t$$




3. Some of the roots are complex. Complex roots (which necessarily come
in conjugate pairs) have the form $\alpha_i \pm i\theta$, where $\alpha_i$ and $\theta$ are real numbers
and $i$ is defined to be $\sqrt{-1}$. For any such pair, a solution to the homogeneous
equation is $A_1(\alpha_1 + i\theta)^t + A_2(\alpha_1 - i\theta)^t$, where $A_1$ and $A_2$ are arbitrary
constants. Transforming to polar coordinates, the associated two solutions
can be written in the form $\beta_1r^t\cos(\theta t + \beta_2)$ with arbitrary constants $\beta_1$ and
$\beta_2$. Here stability hinges on the magnitude of $r^t$; if $|r| < 1$, the system
converges. However, even if there is convergence, convergence is not direct
because the sine and cosine functions impart oscillatory behavior to the
time path of $y_t$. For example, if there are three roots, two of which are
complex, the homogeneous solution has the form

$$\beta_1r^t\cos(\theta t + \beta_2) + A_3(\alpha_3)^t$$
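The repeated-root case in item 2 can be verified by substitution. The sketch below is an illustration for a second-order equation with an assumed double root of 0.5 (names and values are hypothetical): since $(\alpha - 0.5)^2 = \alpha^2 - \alpha + 0.25$, the coefficients are $a_1 = 1$ and $a_2 = -0.25$, and the code evaluates $y_t - a_1y_{t-1} - a_2y_{t-2}$ for three candidate solutions.

```python
def residual(solution, a1, a2, t):
    """solution(t) - a1*solution(t-1) - a2*solution(t-2) for a homogeneous equation."""
    return solution(t) - a1 * solution(t - 1) - a2 * solution(t - 2)


root = 0.5                                # assumed repeated root
a1, a2 = 2 * root, -root ** 2             # from (alpha - root)^2 = alpha^2 - a1*alpha - a2

candidates = {
    "alpha^t": lambda t: root ** t,
    "t*alpha^t": lambda t: t * root ** t,
    "t^2*alpha^t": lambda t: t ** 2 * root ** t,   # fails: the root is only double
}

worst = {name: max(abs(residual(f, a1, a2, t)) for t in range(2, 11))
         for name, f in candidates.items()}
print(worst)
```

The first two candidates leave a zero residual, while $t^2\alpha^t$ does not, consistent with a root of multiplicity $m = 2$ contributing the solutions up to $t^{m-1}\alpha^t$ only.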

Stability of Higher-Order Systems: In practice, it is difficult to find the actual values
of the characteristic roots. Unless the characteristic equation is easily factored, it is
necessary to use numerical methods to obtain the characteristic roots. Fortunately,
software packages such as Mathematica, Maple, or Mathcad can easily obtain the
characteristic roots of any specific difference equation. At one time it was popular to use the
Schur Theorem to determine whether all of the roots lie within the unit circle. Rather
than calculate all of these determinants, it is often possible to use the simple rules
discussed in Section 6. Those of you familiar with matrix algebra may wish to consult the
first edition of this text or Samuelson (1941) for the appropriate conditions.








STATIONARY TIME-SERIES 
MODELS 






The theory of linear difference equations can be extended to allow the forcing process
$\{x_t\}$ to be stochastic. This class of linear stochastic difference equations underlies
much of the theory of time-series econometrics. Especially important is the Box-Jenkins
(1976) methodology for estimating time-series models of the form

$$y_t = a_0 + a_1y_{t-1} + \cdots + a_py_{t-p} + \varepsilon_t + \beta_1\varepsilon_{t-1} + \cdots + \beta_q\varepsilon_{t-q}$$

Such models are called autoregressive integrated moving-average (ARIMA)
time-series models. This chapter has three aims:

1. Present the theory of stochastic linear difference equations and consider
the time-series properties of stationary ARIMA models; a stationary
ARIMA model is called an autoregressive moving-average (ARMA)
model. It is shown that the stability conditions of the previous chapter are
necessary conditions for stationarity.

2. Develop the tools used in estimating ARMA models. Especially useful are
the autocorrelation and partial autocorrelation functions. It is shown how
the Box-Jenkins methodology relies on these tools to estimate an ARMA
model from sample data.

3. Consider various test statistics to check for model adequacy. Several examples
of estimated ARMA models are analyzed in detail. It is shown how a
properly estimated model can be used for forecasting.

1. STOCHASTIC DIFFERENCE EQUATION MODELS 

In this chapter, we continue to work with discrete, rather than continuous, time-series
models. Recall from the discussion in Chapter 1 that we can evaluate the function
$y = f(t)$ at $t_0$ and $t_0 + h$ to form

$$\Delta y = f(t_0 + h) - f(t_0)$$

As a practical matter, most economic time-series data are collected for discrete
time periods. Thus, we consider only the equidistant intervals $t_0$, $t_0 + h$, $t_0 + 2h$, $t_0 +
3h$, ..., and conveniently set $h = 1$. Be careful to recognize, however, that a discrete
time series implies that $t$, but not necessarily $y_t$, is discrete. For example, although
Scotland's annual rainfall is a continuous variable, the sequence of such annual
rainfall totals for years 1 through $t$ is a discrete time series. In many economic
applications, $t$ refers to "time" so that $h$ represents the change in time. However, $t$
need not refer to the type of time interval as measured by a clock or calendar.
Instead of allowing our measurement units to be minutes, days, quarters, or years,
$t$ can refer to an ordered event number. We could let $y_t$ denote the outcome of spin
$t$ on a roulette wheel; $y_t$ can then take on any of the 38 values 00, 0, 1, ..., 36.

A discrete variable;' is said to be a random variable (i.e., stochastic) if for any 
real number r there exists a probability p{y < r) thaty will take on a value less than 
or equal to r. This definition is fairly general; in common usage, it is typically implied 
that there is at least one value of r for which 0 < p{y = r) < 1 . If there is some r for 
which p(y = r)= 1 , y is deterministic rather than random. 

It is useful to consider the elements of an observed time series $\{y_0, y_1, y_2, \ldots, y_t\}$
as being realizations (i.e., outcomes) of a stochastic process. As in Chapter 1, we
continue to let the notation $y_t$ refer to an element of the entire sequence $\{y_t\}$. In our
roulette example, $y_t$ denotes the outcome of spin $t$ on a roulette wheel. If we observe
spins 1 through $T$, we can form the sequence $y_1, y_2, \ldots, y_T$ or, more compactly, $\{y_t\}$.
In the same way, the term $y_t$ could be used to denote gross domestic product (GDP)
in time period $t$. Since we cannot forecast GDP perfectly, $y_t$ is a random variable.
Once we learn the value of GDP in period $t$, $y_t$ becomes one of the realized values
from a stochastic process. (Of course, measurement error may prevent us from ever
knowing the "true" value of GDP.)

For discrete variables, the probability distribution of $y_t$ is given by a formula (or
table) that specifies each possible realized value of $y_t$ and the probability associated
with that realization. If the realizations are linked across time, there exists the joint
probability distribution $p(y_1 = r_1, y_2 = r_2, \ldots, y_T = r_T)$ where $r_t$ is the realized value of
$y$ in period $t$. Having observed the first $t$ realizations, we can form the expected value
of $y_{t+1}, y_{t+2}, \ldots$, conditioned on the observed values of $y_1$ through $y_t$. This conditional
mean, or expected value, of $y_{t+i}$ is denoted by $E_t[y_{t+i} \mid y_t, y_{t-1}, \ldots, y_1]$ or $E_t y_{t+i}$.

Of course, if y, refers to the outcome of spinning a fair roulette wheel, the prob- 
ability distribution is easily characterized. In contrast, we may never be able to com- 
pletely describe the probability distribution for GDP. Nevertheless, the task of 
economic theorists is to develop models that capture the essence of the true data- 
generating process. Stochastic difference equations are one convenient way of mod- 
eling dynamic economic processes. To take a simple example, suppose that the 
Federal Reserve’s money supply target grows 3 percent each year. Hence, 

$$m_t^* = 1.03\,m_{t-1}^* \tag{2.1}$$

so that, given the initial condition $m_0^*$, the particular solution is

$$m_t^* = (1.03)^t m_0^*$$

where: $m_t^*$ = the logarithm of the money supply target in year $t$

$m_0^*$ = the initial condition for the target money supply in period zero

Of course, the actual money supply ($m_t$) and the target need not be equal.
Suppose that at the end of period $t-1$, there exist $m_{t-1}$ outstanding dollars that are
carried forward into period $t$. Hence, at the beginning of $t$ there are $m_{t-1}$ dollars so that



CHAPTER 2 STATIONARY TIME-SERIES MODELS



the gap between the target and the actual money supply is $m_t^* - m_{t-1}$. Suppose that the
Fed cannot perfectly control the money supply but attempts to change the money supply
by $\rho$ percent ($\rho < 100\%$) of any gap between the desired and actual money supply.
We can model this behavior as

$$\Delta m_t = \rho\,[m_t^* - m_{t-1}] + \varepsilon_t$$

or using (2.1), we obtain

$$m_t = \rho\,(1.03)^t m_0^* + (1-\rho)\,m_{t-1} + \varepsilon_t \tag{2.2}$$

where $\varepsilon_t$ is the uncontrollable portion of the money supply.

We assume the mean of $\varepsilon_t$ is zero in all time periods.

Although the economic theory is overly simple, the model does illustrate the key 
points discussed above. Note the following: 

1. Although the money supply is a continuous variable, (2.2) is a discrete
difference equation. Since the forcing process $\{\varepsilon_t\}$ is stochastic, the money
supply is stochastic; we can call (2.2) a linear stochastic difference equation.

2. If we knew the distribution of $\{\varepsilon_t\}$, we could calculate the distribution for
each element in the $\{m_t\}$ sequence. Since (2.2) shows how the realizations
of the $\{m_t\}$ sequence are linked across time, we would be able to calculate
the various joint probabilities. Notice that the distribution of the money
supply sequence is completely determined by the parameters of the
difference equation (2.2) and the distribution of the $\{\varepsilon_t\}$ sequence.

3. Having observed the first $t$ observations in the $\{m_t\}$ sequence, we can
make forecasts of $m_{t+1}, m_{t+2}, \ldots$ For example, updating (2.2) by one
period and taking the conditional expectation, the forecast of $m_{t+1}$ is:
$E_t m_{t+1} = \rho\,(1.03)^{t+1} m_0^* + (1-\rho)\,m_t$
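The one-step-ahead forecast in point 3 can be sketched in a few lines of Python. This is only an illustration under assumed values: the function name `forecast_money_supply` and the numbers chosen for the adjustment share, the initial target, and the current money supply are hypothetical and do not come from the text. The adjustment share `rho` is written as a fraction rather than a percentage.

```python
def forecast_money_supply(m_t, t, rho, m_star_0, growth=1.03):
    """One-step-ahead forecast E_t[m_{t+1}] implied by equation (2.2)."""
    target_next = (growth ** (t + 1)) * m_star_0  # m*_{t+1} from (2.1)
    # The shock eps_{t+1} has mean zero, so it drops out of the forecast.
    return rho * target_next + (1.0 - rho) * m_t


# Hypothetical values: rho = 0.5, m*_0 = 100, and m_t = 100 observed at t = 0.
forecast = forecast_money_supply(m_t=100.0, t=0, rho=0.5, m_star_0=100.0)
print(forecast)
```

With `rho` equal to one the forecast is simply next period's target; with `rho` equal to zero it is the current money supply.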

Before we proceed too far along these lines, let's go back to the basic building
block of discrete stochastic time-series models: the white-noise process. A sequence
$\{\varepsilon_t\}$ is a white-noise process if each value in the sequence has a mean of zero, a
constant variance, and is uncorrelated with all other realizations. Formally, if the notation
$E(x)$ denotes the theoretical mean value of $x$, the sequence $\{\varepsilon_t\}$ is a white-noise
process if for each time period $t$

$$E(\varepsilon_t) = E(\varepsilon_{t-1}) = \cdots = 0$$

$$E(\varepsilon_t^2) = E(\varepsilon_{t-1}^2) = \cdots = \sigma^2 \quad [\text{or } \operatorname{var}(\varepsilon_t) = \operatorname{var}(\varepsilon_{t-1}) = \cdots = \sigma^2]$$

$$E(\varepsilon_t \varepsilon_{t-s}) = E(\varepsilon_{t-j} \varepsilon_{t-j-s}) = 0 \text{ for all } j \text{ and } s \quad [\text{or } \operatorname{cov}(\varepsilon_t, \varepsilon_{t-s}) = \operatorname{cov}(\varepsilon_{t-j}, \varepsilon_{t-j-s}) = 0]$$

In the remainder of this text, $\{\varepsilon_t\}$ will always refer to a white-noise process and
$\sigma^2$ will refer to the variance of that process. When it is necessary to refer to two or
more white-noise processes, symbols such as $\{\varepsilon_{1t}\}$ and $\{\varepsilon_{2t}\}$ will be used. Now, use
a white-noise process to construct the more interesting time series

$$x_t = \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \tag{2.3}$$

For each period $t$, $x_t$ is constructed by taking the values $\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q}$ and
multiplying each by the associated value of $\beta_i$. A sequence formed in this manner is called




a moving average of order $q$ and is denoted by MA($q$). To illustrate a typical moving
average process, suppose you win \$1 if a fair coin shows a head and lose \$1 if it shows
a tail. Denote the outcome on toss $t$ by $\varepsilon_t$ (i.e., for toss $t$, $\varepsilon_t$ is either +\$1 or −\$1). If you
want to keep track of your hot streaks, you might want to calculate your average
winnings on the last four tosses. For each coin toss $t$, your average payoff on the last four
tosses is $\tfrac{1}{4}\varepsilon_t + \tfrac{1}{4}\varepsilon_{t-1} + \tfrac{1}{4}\varepsilon_{t-2} + \tfrac{1}{4}\varepsilon_{t-3}$. In terms of (2.3), this sequence is a
moving average process such that $\beta_i = 0.25$ for $i \le 3$ and zero otherwise.

Although the $\{\varepsilon_t\}$ sequence is a white-noise process, the constructed $\{x_t\}$ sequence
will not be a white-noise process if two or more of the $\beta_i$ differ from zero. To illustrate
using an MA(1) process, set $\beta_0 = 1$, $\beta_1 = 0.5$, and all other $\beta_i = 0$. In this circumstance,
$E(x_t) = E(\varepsilon_t + 0.5\varepsilon_{t-1}) = 0$ and $\operatorname{var}(x_t) = \operatorname{var}(\varepsilon_t + 0.5\varepsilon_{t-1}) = 1.25\sigma^2$. You can easily
convince yourself that $E(x_t) = E(x_{t-s})$ and that $\operatorname{var}(x_t) = \operatorname{var}(x_{t-s})$ for all $s$. Hence, the first
two conditions for $\{x_t\}$ to be a white-noise process are satisfied. However,

$$E(x_t x_{t-1}) = E[(\varepsilon_t + 0.5\varepsilon_{t-1})(\varepsilon_{t-1} + 0.5\varepsilon_{t-2})] = E[\varepsilon_t\varepsilon_{t-1} + 0.5(\varepsilon_{t-1})^2 + 0.5\varepsilon_t\varepsilon_{t-2} + 0.25\varepsilon_{t-1}\varepsilon_{t-2}] = 0.5\sigma^2$$

Given that there exists a value of $s \ne 0$ such that $E(x_t x_{t-s}) \ne 0$, the $\{x_t\}$ sequence is not
a white-noise process.
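The MA(1) calculation above is a special case of the rule that, for a finite moving average built from white noise, $E(x_t x_{t-s}) = \sigma^2 \sum_i \beta_i \beta_{i+s}$. A minimal Python sketch of that rule follows; the function name `ma_autocovariance` is a hypothetical helper, and $\sigma^2$ is normalized to one as an assumption.

```python
def ma_autocovariance(betas, s, sigma2=1.0):
    """Return E(x_t x_{t-s}) = sigma2 * sum_i beta_i * beta_{i+s} for a finite MA."""
    s = abs(s)
    # zip pairs beta_i with beta_{i+s}; terms past the last lag vanish.
    return sigma2 * sum(b_i * b_lag for b_i, b_lag in zip(betas, betas[s:]))


betas = [1.0, 0.5]  # the MA(1) example: beta_0 = 1, beta_1 = 0.5
gammas = [ma_autocovariance(betas, s) for s in range(3)]
print(gammas)  # variance, lag-1 and lag-2 autocovariances in units of sigma^2
```

The nonzero lag-1 value is what rules out white noise; the same helper can be pointed at the coin-toss weights once you have worked the exercise by hand.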

Exercise 1 at the end of this chapter asks you to find the mean, variance, and
covariance of your hot streaks in coin tossing. For practice, you should work that
exercise before continuing.

2. ARMA MODELS 

It is possible to combine a moving average process with a linear difference equation
to obtain an autoregressive moving average model. Consider the $p$-th order
difference equation:

$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + x_t \tag{2.4}$$

Now let $\{x_t\}$ be the MA($q$) process given by (2.3) so that we can write

$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \tag{2.5}$$

We follow the convention of normalizing units so that $\beta_0$ is always equal to
unity. If the characteristic roots of (2.5) are all in the unit circle, $\{y_t\}$ is called an
autoregressive moving-average (ARMA) model for $y_t$. The autoregressive part of
the model is the difference equation given by the homogeneous portion of (2.4) and
the moving average part is the $\{x_t\}$ sequence. If the homogeneous part of the
difference equation contains $p$ lags and the model for $x_t$ contains $q$ lags, the model is
called an ARMA($p$, $q$) model. If $q = 0$, the process is called a pure autoregressive
process denoted by AR($p$), and if $p = 0$, the process is a pure moving-average process
denoted by MA($q$). In an ARMA model, it is perfectly permissible to allow $p$ and/or
$q$ to be infinite. In this chapter we consider only models in which all of the
characteristic roots of (2.5) are within the unit circle. However, if one or more
characteristic roots of (2.5) is greater than or equal to unity, the $\{y_t\}$ sequence is said to be an
integrated process and (2.5) is called an autoregressive integrated moving average
(ARIMA) model.







Treating (2.5) as a difference equation suggests that we can "solve" for $y_t$ in terms
of the $\{\varepsilon_t\}$ sequence. The solution of an ARMA($p$, $q$) model expressing $y_t$ in terms of
the $\{\varepsilon_t\}$ sequence is the moving-average representation of $y_t$. The procedure is no
different from that discussed in Chapter 1. For the AR(1) model $y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$,
the moving-average representation was shown to be

$$y_t = a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i}$$

For the general ARMA($p$, $q$) model, rewrite (2.5) using lag operators so that

$$\Bigl(1 - \sum_{i=1}^{p} a_i L^i\Bigr) y_t = a_0 + \sum_{i=0}^{q} \beta_i \varepsilon_{t-i}$$

so that the particular solution for $y_t$ is

$$y_t = \Bigl(a_0 + \sum_{i=0}^{q} \beta_i \varepsilon_{t-i}\Bigr) \Big/ \Bigl(1 - \sum_{i=1}^{p} a_i L^i\Bigr) \tag{2.6}$$

Fortunately, it will not be necessary for us to expand (2.6) to obtain the specific
coefficient for each element in $\{\varepsilon_t\}$. The important point to recognize is that the
expansion will yield an MA($\infty$) process. The issue is whether such an expansion is
convergent so that the stochastic difference equation given by (2.6) is stable. As you
will see in the next section, the stability condition is that the roots of the polynomial
$(1 - \sum a_i L^i)$ must lie outside of the unit circle. It is also shown that if $y_t$ is a linear
stochastic difference equation, the stability condition is a necessary condition for the
time series $\{y_t\}$ to be stationary.
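For a second-order autoregressive part, the stability condition can be checked directly from the quadratic formula. The sketch below is an illustration only: the function names are hypothetical, and the coefficient values 1.6 and −0.9 are chosen for the example. It shows that characteristic roots inside the unit circle correspond to roots of the lag polynomial $1 - a_1 L - a_2 L^2$ outside it, since the two sets of roots are reciprocals.

```python
import cmath


def ar2_characteristic_roots(a1, a2):
    """Roots of alpha^2 - a1*alpha - a2 = 0 for y_t = a1*y_{t-1} + a2*y_{t-2} + ..."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0


def is_stable(a1, a2):
    """True when both characteristic roots lie strictly inside the unit circle."""
    return all(abs(root) < 1.0 for root in ar2_characteristic_roots(a1, a2))


roots = ar2_characteristic_roots(1.6, -0.9)  # illustration values
inverse_roots = [1.0 / r for r in roots]     # roots of 1 - a1*L - a2*L^2 = 0
print([abs(r) for r in roots], [abs(r) for r in inverse_roots])
```

Using `cmath.sqrt` keeps the same code path for real and complex roots; for higher orders a numerical polynomial root finder would be needed instead.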

3. STATIONARITY

Suppose that the quality control division of a manufacturing firm samples four
machines each hour. Every hour, quality control finds the mean of the machines'
output levels. The plot of each machine's hourly output is shown in Figure 2.1. If $y_{it}$
represents machine $i$'s output at hour $t$, the means ($\bar{y}_t$) are readily calculated as

$$\bar{y}_t = \sum_{i=1}^{4} y_{it}/4$$

For hours 5, 10, and 15, these mean values are 4.61, 5.14, and 5.03, respectively.

The sample variance for each hour can similarly be constructed. Unfortunately,
applied econometricians do not usually have the luxury of being able to obtain an
ensemble (i.e., multiple time-series data of the same process over the same time
period). Typically, we observe only one set of realizations for any particular series.
Fortunately, if $\{y_t\}$ is a stationary series, the mean, variance, and autocorrelations
can usually be well approximated by sufficiently long time averages based on the
single set of realizations. Suppose you observed only the output of machine 1 for 20



FIGURE 2.1 Hourly Output of Four Machines (horizontal axis: Hour)

periods. If you knew that the output was stationary, you could approximate the mean
level of output by

$$\bar{y} = \sum_{t=1}^{20} y_{1t}/20$$

In using this approximation, you would be assuming that the mean was the same
for each period. Formally, a stochastic process having a finite mean and variance is
covariance stationary if for all $t$ and $t - s$,

$$E(y_t) = E(y_{t-s}) = \mu \tag{2.7}$$

$$E[(y_t - \mu)^2] = E[(y_{t-s} - \mu)^2] = \sigma_y^2 \quad [\operatorname{var}(y_t) = \operatorname{var}(y_{t-s}) = \sigma_y^2] \tag{2.8}$$

$$E[(y_t - \mu)(y_{t-s} - \mu)] = E[(y_{t-j} - \mu)(y_{t-j-s} - \mu)] = \gamma_s \quad [\operatorname{cov}(y_t, y_{t-s}) = \operatorname{cov}(y_{t-j}, y_{t-j-s}) = \gamma_s] \tag{2.9}$$

where $\mu$, $\sigma_y^2$, and $\gamma_s$ are all constants.

In (2.9), allowing $s = 0$ means that $\gamma_0$ is equivalent to the variance of $y_t$. Simply
put, a time series is covariance stationary if its mean and all autocovariances are
unaffected by a change of time origin. In the literature, a covariance stationary
process is also referred to as a weakly stationary, second-order stationary, or
wide-sense stationary process. (Note that a strongly stationary process need not have a
finite mean and/or variance.) The text considers only covariance stationary series so
that there is no ambiguity in using the terms stationary and covariance stationary
interchangeably. One further word about terminology: In multivariate models, the
term autocovariance is reserved for the covariance between $y_t$ and its own lags.
Cross-covariance refers to the covariance between one series and another. In
univariate time-series models, there is no ambiguity and the terms autocovariance and
covariance are used interchangeably.








For a covariance stationary series, we can define the autocorrelation between $y_t$
and $y_{t-s}$ as

$$\rho_s \equiv \gamma_s/\gamma_0$$

where $\gamma_0$ and $\gamma_s$ are defined by (2.9).

Since $\gamma_s$ and $\gamma_0$ are time-independent, the autocorrelation coefficients $\rho_s$ are also
time-independent. Although the autocorrelation between $y_t$ and $y_{t-1}$ can differ from
the autocorrelation between $y_t$ and $y_{t-2}$, the autocorrelation between $y_t$ and $y_{t-1}$ must
be identical to that between $y_{t-s}$ and $y_{t-s-1}$. Obviously, $\rho_0 = 1$.

Stationarity Restrictions for an AR(1) Process 

For expositional convenience, first consider the necessary and sufficient conditions
for an AR(1) process to be stationary. Let

$$y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$$

where $\varepsilon_t$ = white noise.

Suppose that the process started in period zero, so that $y_0$ is a deterministic
initial condition. In Section 3 of the last chapter, it was shown that the solution to this
equation is (see also Question 2 at the end of this chapter)

$$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i} \tag{2.10}$$

Taking the expected value of (2.10), we obtain

$$E y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 \tag{2.11}$$

Updating by $s$ periods yields

$$E y_{t+s} = a_0 \sum_{i=0}^{t+s-1} a_1^i + a_1^{t+s} y_0 \tag{2.12}$$

Comparing (2.11) and (2.12), it is clear that both means are time dependent.
Since $E y_t$ is not equal to $E y_{t+s}$, the sequence cannot be stationary. However, if $t$ is
large, we can consider the limiting value of $y_t$ in (2.10). If $|a_1| < 1$, the expression
$(a_1)^t y_0$ converges to zero as $t$ becomes infinitely large and the sum
$a_0[1 + a_1 + (a_1)^2 + (a_1)^3 + \cdots]$ converges to $a_0/(1 - a_1)$. Thus, as $t \to \infty$ and if $|a_1| < 1$,

$$\lim y_t = \frac{a_0}{1 - a_1} + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i} \tag{2.13}$$

Now take expectations of (2.13) so that for sufficiently large values of $t$,
$E y_t = a_0/(1 - a_1)$. Thus, the mean value of $y_t$ is finite and time independent so that
$E y_t = E y_{t-s} = a_0/(1 - a_1) \equiv \mu$ for all $t$. Turning to the variance, we find

$$E(y_t - \mu)^2 = E[(\varepsilon_t + a_1 \varepsilon_{t-1} + (a_1)^2 \varepsilon_{t-2} + \cdots)^2]$$

$$= \sigma^2[1 + (a_1)^2 + (a_1)^4 + \cdots] = \sigma^2/[1 - (a_1)^2]$$






which is also finite and time-independent. Finally, it is easily demonstrated that the
limiting values of all autocovariances are finite and time-independent:

$$E[(y_t - \mu)(y_{t-s} - \mu)] = E\{[\varepsilon_t + a_1 \varepsilon_{t-1} + (a_1)^2 \varepsilon_{t-2} + \cdots][\varepsilon_{t-s} + a_1 \varepsilon_{t-s-1} + (a_1)^2 \varepsilon_{t-s-2} + \cdots]\}$$

$$= \sigma^2 (a_1)^s [1 + (a_1)^2 + (a_1)^4 + \cdots] = \sigma^2 (a_1)^s/[1 - (a_1)^2] \tag{2.14}$$

In summary, if we can use the limiting value of (2.10), the $\{y_t\}$ sequence will be
stationary. For any given $y_0$ and $|a_1| < 1$, it follows that $t$ must be sufficiently large.
Thus, if a sample is generated by a process that has recently begun, the realizations
may not be stationary. It is for this very reason that many econometricians assume
that the data-generating process has been occurring for an infinitely long time. In
practice, the researcher must be wary of any data generated from a "new" process.
For example, $\{y_t\}$ could represent the daily change in the dollar/mark exchange rate
beginning immediately after the demise of the Bretton Woods fixed exchange rate
system. Such a series may not be stationary because there were deterministic
initial conditions (exchange rate changes were essentially zero in the Bretton Woods
era). The careful researcher wishing to use stationary series might consider excluding
some of these earlier observations from the period of analysis.
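The role of the initial condition can be seen numerically. The sketch below evaluates the time-dependent mean in (2.11) and compares it with the limiting mean and variance; the function names and the parameter values ($a_0 = 2$, $a_1 = 0.5$, $y_0 = 10$, $\sigma^2 = 1$) are assumptions made for illustration.

```python
def ar1_conditional_mean(t, a0, a1, y0):
    """E[y_t] from (2.11): a0 * sum_{i<t} a1^i + a1^t * y0."""
    return a0 * sum(a1 ** i for i in range(t)) + (a1 ** t) * y0


def ar1_limits(a0, a1, sigma2=1.0):
    """Limiting mean and variance, valid only when |a1| < 1."""
    if abs(a1) >= 1.0:
        raise ValueError("limits exist only for |a1| < 1")
    return a0 / (1.0 - a1), sigma2 / (1.0 - a1 ** 2)


a0, a1, y0 = 2.0, 0.5, 10.0  # hypothetical parameter values
for t in (1, 5, 50):
    print(t, ar1_conditional_mean(t, a0, a1, y0))
print(ar1_limits(a0, a1))
```

For small $t$ the mean still reflects $y_0$; by $t = 50$ it is indistinguishable from $a_0/(1 - a_1)$, which is the sense in which a "new" process may not yet look stationary.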

Little would change were we not given the initial condition. Without the initial
value $y_0$, the sum of the homogeneous and particular solutions for $y_t$ is

$$y_t = a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i} + A(a_1)^t \tag{2.15}$$

where $A$ = an arbitrary constant.

If you take the expectation of (2.15), it is clear that the $\{y_t\}$ sequence cannot be
stationary unless the expression $A(a_1)^t$ is equal to zero. Either the sequence must have
started infinitely long ago (so that $a_1^t = 0$) or the arbitrary constant $A$ must be zero.
Recall that the arbitrary constant is interpreted as a deviation from long-run
equilibrium. The stability conditions can be stated succinctly:

1. The homogeneous solution must be zero. Either the sequence must have
started infinitely far in the past or the process must always be in
equilibrium (so that the arbitrary constant is zero).

2. The characteristic root $a_1$ must be less than unity in absolute value.

These two conditions readily generalize to all ARMA($p$, $q$) processes. We know
that the homogeneous solution to (2.5) has the form

$$\sum_{i=1}^{p} A_i \alpha_i^t$$

or if there are $m$ repeated roots,

$$\alpha^t \sum_{i=1}^{m} A_i t^{i-1} + \sum_{i=m+1}^{p} A_i \alpha_i^t$$

where the $A_i$ are all arbitrary constants, $\alpha$ is the repeated root, and the $\alpha_i$ are the
distinct roots.







If any portion of the homogeneous equation is present, the mean, variance, and 
all covariances will be time-dependent. Hence, for any ARMA(p, q) model, station- 
arity necessitates that the homogeneous solution be zero. The next section addresses 
the stationarity restrictions for the particular solution. 

4. STATIONARY RESTRICTIONS FOR AN ARMA(p, q) MODEL

As a prelude to the stationarity conditions for the general ARMA($p$, $q$) model, first
consider the restrictions necessary to ensure that an ARMA(2, 1) model is stationary.
Since the magnitude of the intercept term does not affect the stability (or stationarity)
conditions, set $a_0 = 0$ and write

$$y_t = a_1 y_{t-1} + a_2 y_{t-2} + \varepsilon_t + \beta_1 \varepsilon_{t-1} \tag{2.16}$$

From the previous section, we know that the homogeneous solution must be
zero. As such, it is only necessary to find the particular solution. Using the method of
undetermined coefficients, we can write the challenge solution as

$$y_t = \sum_{i=0}^{\infty} \alpha_i \varepsilon_{t-i} \tag{2.17}$$

For (2.17) to be a solution of (2.16), the various $\alpha_i$ must satisfy

$$\alpha_0\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \alpha_3\varepsilon_{t-3} + \cdots = a_1(\alpha_0\varepsilon_{t-1} + \alpha_1\varepsilon_{t-2} + \alpha_2\varepsilon_{t-3} + \alpha_3\varepsilon_{t-4} + \cdots)$$
$$\qquad + a_2(\alpha_0\varepsilon_{t-2} + \alpha_1\varepsilon_{t-3} + \alpha_2\varepsilon_{t-4} + \alpha_3\varepsilon_{t-5} + \cdots) + \varepsilon_t + \beta_1\varepsilon_{t-1}$$

To match coefficients on the terms containing $\varepsilon_t, \varepsilon_{t-1}, \varepsilon_{t-2}, \ldots$, it is necessary
to set

1. $\alpha_0 = 1$

2. $\alpha_1 = a_1\alpha_0 + \beta_1 \Rightarrow \alpha_1 = a_1 + \beta_1$

3. $\alpha_i = a_1\alpha_{i-1} + a_2\alpha_{i-2}$ for all $i \ge 2$

The key point is that for $i \ge 2$, the coefficients satisfy the difference equation
$\alpha_i = a_1\alpha_{i-1} + a_2\alpha_{i-2}$. If the characteristic roots of (2.16) are within the unit circle,
the $\{\alpha_i\}$ must constitute a convergent sequence. For example, reconsider the case
in which $a_1 = 1.6$, $a_2 = -0.9$, and let $\beta_1 = 0.5$. Worksheet 2.1 shows that the
coefficients satisfying (2.17) are 1, 2.1, 2.46, 2.046, 1.06, −0.146, ... (also see Worksheet
1.2 of the previous chapter).
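The three matching conditions translate directly into an iteration. The following Python sketch (the function name `arma21_ma_coefficients` is a hypothetical helper, not something defined in the text) generates the first few $\alpha_i$ for the example with $a_1 = 1.6$, $a_2 = -0.9$, and $\beta_1 = 0.5$.

```python
def arma21_ma_coefficients(a1, a2, beta1, n):
    """First n coefficients alpha_0, ..., alpha_{n-1} of the MA representation."""
    alphas = [1.0, a1 + beta1]  # alpha_0 = 1, alpha_1 = a1 + beta1
    while len(alphas) < n:
        # alpha_i = a1 * alpha_{i-1} + a2 * alpha_{i-2} for i >= 2
        alphas.append(a1 * alphas[-1] + a2 * alphas[-2])
    return alphas[:n]


alphas = arma21_ma_coefficients(1.6, -0.9, 0.5, 6)
print([round(a, 3) for a in alphas])
```

Rounded to three decimals, the output can be compared with the values 1, 2.1, 2.46, 2.046, 1.06, −0.146 quoted above.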

To verify that the $\{y_t\}$ sequence generated by (2.17) is stationary, take the
expectation of (2.17) to form $E y_t = E y_{t-i} = 0$ for all $t$ and $i$. Hence, the mean is finite and
time-invariant. Since the $\{\varepsilon_t\}$ sequence is assumed to be a white-noise process, the
variance of $y_t$ is constant and time-independent; that is,

$$\operatorname{Var}(y_t) = E[(\alpha_0\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \alpha_3\varepsilon_{t-3} + \cdots)^2] = \sigma^2 \sum_{i=0}^{\infty} \alpha_i^2$$






WORKSHEET 2.1

COEFFICIENTS OF THE ARMA(2,1) PROCESS:

$y_t = 1.6y_{t-1} - 0.9y_{t-2} + \varepsilon_t + 0.5\varepsilon_{t-1}$

If we use the method of undetermined coefficients, the $\alpha_i$ must satisfy:

$\alpha_0 = 1$
$\alpha_1 = 1.6 + 0.5$, hence $\alpha_1 = 2.1$
$\alpha_i = 1.6\alpha_{i-1} - 0.9\alpha_{i-2}$ for all $i = 2, 3, 4, \ldots$

Notice that the coefficients follow a second-order difference equation with imaginary
roots. If we use de Moivre's Theorem, the coefficients will satisfy

$$\alpha_i = 0.949^i \beta_1 \cos(0.567i + \beta_2)$$

Imposing the initial conditions for $\alpha_0$ and $\alpha_1$ yields

$$1 = \beta_1 \cos(\beta_2) \quad \text{and} \quad 2.1 = 0.949\,\beta_1 \cos(0.567 + \beta_2)$$

Since $\beta_1 = 1/\cos(\beta_2)$, we seek the solution to

$$\cos(\beta_2) - (0.949/2.1) \cdot \cos(0.567 + \beta_2) = 0$$

From a trig table, the solution for $\beta_2$ is −1.197. Hence, $\beta_1 = 1/\cos(-1.197) \approx 2.74$ and the $\alpha_i$ satisfy

$$\alpha_i = [1/\cos(-1.197)] \cdot 0.949^i \cdot \cos(0.567i - 1.197)$$

Alternatively, we can use the initial values of $\alpha_0$ and $\alpha_1$ to find the other $\alpha_i$ by iteration.
The sequence of the $\alpha_i$ is shown in the graph below.

(Graph: the $\alpha_i$ plotted against $i$ for $i$ = 0 to 60.)

The first values of the sequence are

| $i$        | 0    | 1    | 2    | 3     | 4    | 5      | 6      | 7      | 8      |
| $\alpha_i$ | 1.00 | 2.10 | 2.46 | 2.046 | 1.06 | −0.146 | −1.187 | −1.768 | −1.761 |



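Hence, $\operatorname{var}(y_t) = \operatorname{var}(y_{t-s})$ for all $t$ and $s$. Finally, the covariance between $y_t$ and $y_{t-s}$ is

$$\operatorname{Cov}(y_t, y_{t-1}) = E[(\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \cdots)(\varepsilon_{t-1} + \alpha_1\varepsilon_{t-2} + \alpha_2\varepsilon_{t-3} + \alpha_3\varepsilon_{t-4} + \cdots)]$$
$$= \sigma^2(\alpha_1 + \alpha_2\alpha_1 + \alpha_3\alpha_2 + \cdots)$$

$$\operatorname{Cov}(y_t, y_{t-2}) = E[(\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \cdots)(\varepsilon_{t-2} + \alpha_1\varepsilon_{t-3} + \alpha_2\varepsilon_{t-4} + \alpha_3\varepsilon_{t-5} + \cdots)]$$
$$= \sigma^2(\alpha_2 + \alpha_3\alpha_1 + \alpha_4\alpha_2 + \cdots)$$

so that

$$\operatorname{Cov}(y_t, y_{t-s}) = \sigma^2(\alpha_s + \alpha_{s+1}\alpha_1 + \alpha_{s+2}\alpha_2 + \cdots) \tag{2.18}$$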









Thus, $\operatorname{cov}(y_t, y_{t-s})$ is constant and independent of $t$. Conversely, if the
characteristic roots of (2.16) do not lie within the unit circle, the $\{\alpha_i\}$ sequence will not be
convergent. As such, the $\{y_t\}$ sequence cannot be convergent.

It is not too difficult to generalize these results to the entire class of ARMA($p$, $q$)
models. Begin by considering the conditions ensuring the stationarity of a pure
MA($\infty$) process. By appropriately restricting the $\beta_i$, all of the finite-order MA($q$)
processes can be obtained as special cases. Consider

$$x_t = \sum_{i=0}^{\infty} \beta_i \varepsilon_{t-i}$$

where $\{\varepsilon_t\}$ = a white-noise process with variance $\sigma^2$.

We have already determined that $\{x_t\}$ is not a white-noise process; now the issue
is whether $\{x_t\}$ is covariance stationary. Given conditions (2.7), (2.8), and (2.9), we
ask the following:

1. Is the mean finite and time-independent? Take the expected value of $x_t$ and
remember that the expectation of a sum is the sum of the individual
expectations. Therefore,

$$E(x_t) = E(\varepsilon_t + \beta_1\varepsilon_{t-1} + \beta_2\varepsilon_{t-2} + \cdots) = E\varepsilon_t + \beta_1 E\varepsilon_{t-1} + \beta_2 E\varepsilon_{t-2} + \cdots = 0$$

Repeat the procedure with $x_{t-s}$:

$$E(x_{t-s}) = E(\varepsilon_{t-s} + \beta_1\varepsilon_{t-s-1} + \beta_2\varepsilon_{t-s-2} + \cdots) = 0$$

Hence, all elements in the $\{x_t\}$ sequence have the same finite mean ($\mu = 0$).

2. Is the variance finite and time-independent? Form $\operatorname{var}(x_t)$ as

$$\operatorname{Var}(x_t) = E[(\varepsilon_t + \beta_1\varepsilon_{t-1} + \beta_2\varepsilon_{t-2} + \cdots)^2]$$

Square the term in parentheses and take expectations. Since $\{\varepsilon_t\}$ is a
white-noise process, all terms $E\varepsilon_t\varepsilon_{t-s} = 0$ for $s \ne 0$. Hence

$$\operatorname{Var}(x_t) = E(\varepsilon_t)^2 + (\beta_1)^2 E(\varepsilon_{t-1})^2 + (\beta_2)^2 E(\varepsilon_{t-2})^2 + \cdots = \sigma^2[1 + (\beta_1)^2 + (\beta_2)^2 + \cdots]$$

As long as $\sum(\beta_i)^2$ is finite, it follows that $\operatorname{var}(x_t)$ is finite. Thus, $\sum(\beta_i)^2$ being
finite is a necessary condition for $\{x_t\}$ to be stationary. To determine whether
$\operatorname{var}(x_t) = \operatorname{var}(x_{t-s})$, form

$$\operatorname{Var}(x_{t-s}) = E[(\varepsilon_{t-s} + \beta_1\varepsilon_{t-s-1} + \beta_2\varepsilon_{t-s-2} + \cdots)^2] = \sigma^2[1 + (\beta_1)^2 + (\beta_2)^2 + \cdots]$$

Thus, $\operatorname{var}(x_t) = \operatorname{var}(x_{t-s})$ for all $t$ and $t - s$.

3. Are all autocovariances finite and time-independent? First form
$E(x_t x_{t-s})$ as

$$E[x_t x_{t-s}] = E[(\varepsilon_t + \beta_1\varepsilon_{t-1} + \beta_2\varepsilon_{t-2} + \cdots)(\varepsilon_{t-s} + \beta_1\varepsilon_{t-s-1} + \beta_2\varepsilon_{t-s-2} + \cdots)]$$

Carrying out the multiplication and noting that $E(\varepsilon_t\varepsilon_{t-s}) = 0$ for $s \ne 0$, we get

$$E(x_t x_{t-s}) = \sigma^2(\beta_s + \beta_1\beta_{s+1} + \beta_2\beta_{s+2} + \cdots)$$

Restricting the sum $\beta_s + \beta_1\beta_{s+1} + \beta_2\beta_{s+2} + \cdots$ to be finite means that $E(x_t x_{t-s})$ is
finite. Given this second restriction, it is clear that the covariance between $x_t$ and $x_{t-s}$





depends only on the number of periods separating the variables (i.e., the value of $s$)
and not on the time subscript $t$.

In summary, the necessary and sufficient conditions for any MA process to be
stationary are for the sums (1) $\sum(\beta_i)^2$ and (2) $(\beta_s + \beta_1\beta_{s+1} + \beta_2\beta_{s+2} + \cdots)$ to be finite.
Since (2) must hold for all values of $s$ and $\beta_0 = 1$, condition (1) is redundant. The direct
implication is that a finite-order MA process will always be stationary. For an
infinite-order process, (2) must hold for all $s \ge 0$. Some of the details involved with maximum
likelihood estimation of MA processes are discussed in Appendix 2.1 of this chapter.

Stationarity Restrictions for the 
Autoregressive Coefficients 

Now consider the pure autoregressive model

$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + \varepsilon_t \tag{2.19}$$

If the characteristic roots of the homogeneous equation of (2.19) all lie inside the
unit circle, it is possible to write the particular solution as

$$y_t = \frac{a_0}{1 - \sum_{i=1}^{p} a_i} + \sum_{i=0}^{\infty} \alpha_i \varepsilon_{t-i} \tag{2.20}$$

where the $\alpha_i$ = undetermined coefficients.

Although it is possible to find the undetermined coefficients $\{\alpha_i\}$, we know that
(2.20) is a convergent sequence so long as the characteristic roots of (2.19) are inside
the unit circle. To sketch the proof, the method of undetermined coefficients allows
us to write the particular solution in the form of (2.20). We also know that the
sequence $\{\alpha_i\}$ will eventually solve the difference equation

$$\alpha_i - a_1\alpha_{i-1} - a_2\alpha_{i-2} - \cdots - a_p\alpha_{i-p} = 0 \tag{2.21}$$

If the characteristic roots of (2.21) are all inside the unit circle, the $\{\alpha_i\}$ sequence
will be convergent. Although (2.20) is an infinite-order moving average process, the
convergence of the MA coefficients implies that $\sum\alpha_i^2$ is finite. Thus, we can use
(2.20) to check the three conditions for stationarity. Since $\alpha_0 = 1$,

$$E y_t = E y_{t-s} = a_0/(1 - \textstyle\sum a_i)$$

You should recall from Chapter 1 that a necessary condition for all characteristic
roots to lie inside the unit circle is $1 - \sum a_i > 0$. Hence, the mean of the sequence is
finite and time-invariant.

$$\operatorname{Var}(y_t) = E[(\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \alpha_3\varepsilon_{t-3} + \cdots)^2] = \sigma^2 \textstyle\sum\alpha_i^2$$

and

$$\operatorname{Var}(y_{t-s}) = E[(\varepsilon_{t-s} + \alpha_1\varepsilon_{t-s-1} + \alpha_2\varepsilon_{t-s-2} + \alpha_3\varepsilon_{t-s-3} + \cdots)^2] = \sigma^2 \textstyle\sum\alpha_i^2$$

Given that $\sum\alpha_i^2$ is finite, the variance is finite and time-independent.

$$\operatorname{Cov}(y_t, y_{t-s}) = E[(\varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \cdots)(\varepsilon_{t-s} + \alpha_1\varepsilon_{t-s-1} + \alpha_2\varepsilon_{t-s-2} + \cdots)]$$
$$= \sigma^2(\alpha_s + \alpha_1\alpha_{s+1} + \alpha_2\alpha_{s+2} + \cdots)$$







Thus, the covariance between $y_t$ and $y_{t-s}$ is constant and time-invariant for all $t$
and $t - s$. Nothing of substance is changed by combining the AR($p$) and MA($q$)
models into the general ARMA($p$, $q$) model:

$$y_t = a_0 + \sum_{i=1}^{p} a_i y_{t-i} + x_t$$

$$x_t = \sum_{i=0}^{q} \beta_i \varepsilon_{t-i} \tag{2.22}$$

If the roots of the inverse characteristic equation lie outside of the unit circle
[i.e., if the roots of the homogeneous form of (2.22) lie inside the unit circle] and if
the $\{x_t\}$ sequence is stationary, the $\{y_t\}$ sequence will be stationary. Consider

$$y_t = \frac{a_0}{1 - \sum_{i=1}^{p} a_i} + \frac{\varepsilon_t}{1 - \sum_{i=1}^{p} a_i L^i} + \frac{\beta_1\varepsilon_{t-1}}{1 - \sum_{i=1}^{p} a_i L^i} + \frac{\beta_2\varepsilon_{t-2}}{1 - \sum_{i=1}^{p} a_i L^i} + \cdots \tag{2.23}$$

With very little effort, you can convince yourself that the $\{y_t\}$ sequence satisfies
the three conditions for stationarity. Each of the expressions on the right-hand side of
(2.23) is stationary as long as the roots of $1 - \sum a_i L^i$ are outside the unit circle. Given
that $\{x_t\}$ is stationary, only the roots of the autoregressive portion of (2.22) determine
whether the $\{y_t\}$ sequence is stationary.



5. THE AUTOCORRELATION FUNCTION

The autocovariances and autocorrelations of the type found in (2.18) serve as useful
tools in the Box-Jenkins (1976) approach to identifying and estimating time-series
models. We illustrate by considering four important examples: the AR(1), AR(2), MA(1), and
ARMA(1, 1) models. For the AR(1) model, $y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$, (2.14) shows

$$\gamma_0 = \sigma^2/[1 - (a_1)^2]$$

$$\gamma_s = \sigma^2 (a_1)^s/[1 - (a_1)^2]$$

Forming the autocorrelations by dividing each $\gamma_s$ by $\gamma_0$, we find that $\rho_0 = 1$, $\rho_1 = a_1$,
$\rho_2 = (a_1)^2$, ..., $\rho_s = (a_1)^s$. For an AR(1) process, a necessary condition for stationarity
is for $|a_1| < 1$. Thus, the plot of $\rho_s$ against $s$, called the autocorrelation function
(ACF) or correlogram, should converge to zero geometrically if the series is stationary.
If $a_1$ is positive, convergence will be direct, and if $a_1$ is negative, the autocorrelations
will follow a dampened oscillatory path around zero. The first two graphs on the
left-hand side of Figure 2.2 show the theoretical autocorrelation functions for $a_1 = 0.7$
and $a_1 = -0.7$, respectively. Here, $\rho_0$ is not shown since its value is necessarily unity.



The Autocorrelation Function of an AR(2) Process 



Now consider the more complicated AR(2) process $y_t = a_1 y_{t-1} + a_2 y_{t-2} + \varepsilon_t$. We omit an
intercept term ($a_0$) since it has no effect on the ACF. For the second-order process to be
stationary, we know that it is necessary to restrict the roots of $(1 - a_1 L - a_2 L^2)$ to be
outside the unit circle. In Section 4, we derived the autocovariances of an ARMA(2, 1)




















By definition, the autocovariances of a stationary series are such that $E y_t y_{t-s} =
E y_{t-s} y_t = E y_{t-k} y_{t-k-s} = \gamma_s$. We also know that $E\varepsilon_t y_t = \sigma^2$ and $E\varepsilon_t y_{t-s} = 0$. Hence, we
can use the equations in (2.24) to form

$$\gamma_0 = a_1\gamma_1 + a_2\gamma_2 + \sigma^2 \tag{2.25}$$

$$\gamma_1 = a_1\gamma_0 + a_2\gamma_1 \tag{2.26}$$

$$\gamma_s = a_1\gamma_{s-1} + a_2\gamma_{s-2} \tag{2.27}$$

Dividing (2.26) and (2.27) by $\gamma_0$ yields

$$\rho_1 = a_1\rho_0 + a_2\rho_1 \tag{2.28}$$

$$\rho_s = a_1\rho_{s-1} + a_2\rho_{s-2} \tag{2.29}$$

We know that $\rho_0 = 1$, so that from (2.28), $\rho_1 = a_1/(1 - a_2)$. Hence, we can find all
$\rho_s$ for $s \ge 2$ by solving the difference equation (2.29). For example, for $s = 2$ and $s = 3$,

$$\rho_2 = (a_1)^2/(1 - a_2) + a_2$$

$$\rho_3 = a_1[(a_1)^2/(1 - a_2) + a_2] + a_2 a_1/(1 - a_2)$$

Although the values of the $\rho_s$ are cumbersome to derive, we can easily
characterize their properties. Given the solutions for $\rho_0$ and $\rho_1$, the key point to note is that the
$\rho_s$ all satisfy the difference equation (2.29). As in the general case of a second-order
difference equation, the solution may be oscillatory or direct. Note that the
stationarity condition for $y_t$ necessitates that the characteristic roots of (2.29) lie inside of the
unit circle. Hence, the $\{\rho_s\}$ sequence must be convergent. The correlogram for an
AR(2) process must be such that $\rho_0 = 1$ and that $\rho_1$ be determined by (2.28). These two
values can be viewed as initial values for the second-order difference equation (2.29).

The fourth panel on the left-hand side of Figure 2.2 shows the ACF for the
process $y_t = 0.7y_{t-1} - 0.49y_{t-2} + \varepsilon_t$. The properties of the various $\rho_s$ follow directly
from the homogeneous equation $y_t - 0.7y_{t-1} + 0.49y_{t-2} = 0$. The roots are obtained
from the solution to

$$\alpha = \{0.7 \pm [(-0.7)^2 - 4(0.49)]^{1/2}\}/2$$

Since the discriminant $d = (-0.7)^2 - 4(0.49)$ is negative, the characteristic roots
are imaginary so that the solution oscillates. However, since $a_2 = -0.49$, the solution
is convergent and the $\{y_t\}$ sequence is stationary.

Finally, we may wish to find the autocovariances rather than the autocorrelations.
Since we know all of the autocorrelations, if we can find the variance of $y_t$ (i.e., $\gamma_0$),
we can find all of the other $\gamma_s$. To find $\gamma_0$, use (2.25) and note that $\rho_i = \gamma_i/\gamma_0$ so that

$$\gamma_0(1 - a_1\rho_1 - a_2\rho_2) = \sigma^2$$

Substitution for $\rho_1$ and $\rho_2$ yields

$$\gamma_0 = \operatorname{var}(y_t) = \left[\frac{1 - a_2}{1 + a_2}\right] \frac{\sigma^2}{(a_1 + a_2 - 1)(a_2 - a_1 - 1)}$$
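The recursion (2.29) and the variance expression can be checked against each other numerically. The sketch below uses the process $y_t = 0.7y_{t-1} - 0.49y_{t-2} + \varepsilon_t$ discussed in the text; the function names are hypothetical and $\sigma^2 = 1$ is an assumed normalization.

```python
def ar2_autocorrelations(a1, a2, n):
    """rho_0, ..., rho_{n-1} from (2.28) and the recursion (2.29)."""
    rhos = [1.0, a1 / (1.0 - a2)]
    while len(rhos) < n:
        rhos.append(a1 * rhos[-1] + a2 * rhos[-2])
    return rhos[:n]


def ar2_variance(a1, a2, sigma2=1.0):
    """gamma_0 from the closed form obtained after substituting rho_1 and rho_2."""
    return ((1.0 - a2) / (1.0 + a2)) * sigma2 / ((a1 + a2 - 1.0) * (a2 - a1 - 1.0))


a1, a2 = 0.7, -0.49  # the process whose ACF the text describes
rhos = ar2_autocorrelations(a1, a2, 6)
gamma0 = ar2_variance(a1, a2)
print([round(r, 4) for r in rhos], round(gamma0, 4))
```

The alternating signs in the printed autocorrelations reflect the imaginary roots, and `gamma0` should agree with $\sigma^2/(1 - a_1\rho_1 - a_2\rho_2)$ computed from the same autocorrelations.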













The Autocorrelation Function of an MA(1) Process 

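Next consider the MA(1) process $y_t = \varepsilon_t + \beta\varepsilon_{t-1}$. Again, we can obtain the
Yule-Walker equations by multiplying $y_t$ by each $y_{t-s}$ and taking expectations

$$\gamma_0 = \operatorname{var}(y_t) = Ey_ty_t = E[(\varepsilon_t + \beta\varepsilon_{t-1})(\varepsilon_t + \beta\varepsilon_{t-1})] = (1 + \beta^2)\sigma^2$$

$$\gamma_1 = Ey_ty_{t-1} = E[(\varepsilon_t + \beta\varepsilon_{t-1})(\varepsilon_{t-1} + \beta\varepsilon_{t-2})] = \beta\sigma^2$$

$$\vdots$$

$$\gamma_s = Ey_ty_{t-s} = E[(\varepsilon_t + \beta\varepsilon_{t-1})(\varepsilon_{t-s} + \beta\varepsilon_{t-s-1})] = 0 \quad \text{for all } s > 1$$

Hence, dividing each $\gamma_s$ by $\gamma_0$, it can be immediately seen that the ACF is simply
$\rho_0 = 1$, $\rho_1 = \beta/(1 + \beta^2)$, and $\rho_s = 0$ for all $s > 1$. The third graph on the left-hand
side of Figure 2.2 shows the ACF for the MA(1) process $y_t = \varepsilon_t - 0.7\varepsilon_{t-1}$. As an exercise,
you should demonstrate that the ACF for an MA(2) process has two spikes and
then cuts to zero.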
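The same multiply-and-take-expectations argument extends to any finite moving average. The sketch below is an illustration rather than the text's own derivation: it uses the standard result that, with $\theta_0 = 1$, the autocovariance at lag $s$ is $\sigma^2 \sum_i \theta_i\theta_{i+s}$. The name `ma_acf` is hypothetical.

```python
def ma_acf(betas, max_lag):
    """Theoretical ACF of y_t = e_t + betas[0]*e_{t-1} + ... + betas[q-1]*e_{t-q}.

    Returns [rho_0, rho_1, ..., rho_max_lag].
    """
    theta = [1.0] + list(betas)    # theta_0 = 1 is the coefficient on e_t
    q = len(betas)

    def gamma(s):
        # autocovariance at lag s divided by sigma^2; zero beyond lag q
        if s > q:
            return 0.0
        return sum(theta[i] * theta[i + s] for i in range(q - s + 1))

    g0 = gamma(0)
    return [gamma(s) / g0 for s in range(max_lag + 1)]

print(ma_acf([-0.7], 4))       # MA(1): one spike, rho_1 = -0.7/1.49
print(ma_acf([0.5, 0.3], 4))   # MA(2) with illustrative coefficients
```

For the MA(1) case this reduces to $\rho_1 = \beta/(1 + \beta^2)$, matching the expression above.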

The Autocorrelation Function of an ARMA(1, 1) Process

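Finally, let $y_t = a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1}$. Using the now-familiar procedure, we find the
Yule-Walker equations

$$Ey_ty_t = a_1Ey_{t-1}y_t + E\varepsilon_ty_t + \beta_1E\varepsilon_{t-1}y_t \;\Rightarrow\; \gamma_0 = a_1\gamma_1 + \sigma^2 + \beta_1(a_1 + \beta_1)\sigma^2 \tag{2.30}$$

$$Ey_ty_{t-1} = a_1Ey_{t-1}y_{t-1} + E\varepsilon_ty_{t-1} + \beta_1E\varepsilon_{t-1}y_{t-1} \;\Rightarrow\; \gamma_1 = a_1\gamma_0 + \beta_1\sigma^2 \tag{2.31}$$

$$Ey_ty_{t-2} = a_1Ey_{t-1}y_{t-2} + E\varepsilon_ty_{t-2} + \beta_1E\varepsilon_{t-1}y_{t-2} \;\Rightarrow\; \gamma_2 = a_1\gamma_1 \tag{2.32}$$

$$\vdots$$

$$Ey_ty_{t-s} = a_1Ey_{t-1}y_{t-s} + E\varepsilon_ty_{t-s} + \beta_1E\varepsilon_{t-1}y_{t-s} \;\Rightarrow\; \gamma_s = a_1\gamma_{s-1} \tag{2.33}$$

Solving (2.30) and (2.31) simultaneously for $\gamma_0$ and $\gamma_1$ yields

$$\gamma_0 = \frac{1 + \beta_1^2 + 2a_1\beta_1}{1 - a_1^2}\,\sigma^2$$

$$\gamma_1 = \frac{(1 + a_1\beta_1)(a_1 + \beta_1)}{1 - a_1^2}\,\sigma^2$$

Hence,

$$\rho_1 = \frac{(1 + a_1\beta_1)(a_1 + \beta_1)}{1 + \beta_1^2 + 2a_1\beta_1} \tag{2.34}$$

and $\rho_s = a_1\rho_{s-1}$ for all $s \ge 2$.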
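A short numerical sketch of (2.34) and the decay rule $\rho_s = a_1\rho_{s-1}$ follows. The function name `arma11_acf` is hypothetical; the coefficient values are those of the ARMA(1, 1) example used in Figure 2.2 and Worksheet 2.2.

```python
def arma11_acf(a1, b1, max_lag):
    """Theoretical ACF [rho_0, ..., rho_max_lag] of
    y_t = a1*y_{t-1} + e_t + b1*e_{t-1}, using (2.34) for rho_1
    and rho_s = a1 * rho_{s-1} for s >= 2."""
    rho1 = (1 + a1 * b1) * (a1 + b1) / (1 + b1 ** 2 + 2 * a1 * b1)
    rho = [1.0, rho1]
    for _ in range(2, max_lag + 1):
        rho.append(a1 * rho[-1])
    return rho[: max_lag + 1]

for lag, value in enumerate(arma11_acf(-0.7, -0.7, 8)):
    print(lag, round(value, 4))
```

With $a_1 = \beta_1 = -0.7$ the first autocorrelation is $(1.49)(-1.4)/2.47 \approx -0.8445$, and the later values alternate in sign while shrinking by a factor of 0.7 per lag.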

Thus, the ACF for an ARMA(1, 1) process is such that the magnitude of $\rho_1$ depends
on both $a_1$ and $\beta_1$. Beginning with this value of $\rho_1$, the ACF of an ARMA(1, 1) process
looks like that of the AR(1) process. If $0 < a_1 < 1$, convergence will be direct, and if
$-1 < a_1 < 0$, the autocorrelations will oscillate. The ACF for the function
$y_t = -0.7y_{t-1} + \varepsilon_t - 0.7\varepsilon_{t-1}$ is shown as the last graph on the left-hand side of Figure 2.2.
The top portion of Worksheet 2.2 derives these autocorrelations.

We leave you with the exercise of deriving the correlogram of the ARMA(2, 1)
process used in Worksheet 2.1. You should be able to recognize the point that the
correlogram can reveal the pattern of the autoregressive coefficients. For an ARMA($p$, $q$)
model beginning after lag $q$, the values of the $\rho_i$ will satisfy

$$\rho_i = a_1\rho_{i-1} + a_2\rho_{i-2} + \cdots + a_p\rho_{i-p}$$



WORKSHEET 2.2

CALCULATION OF THE PARTIAL AUTOCORRELATIONS OF
$y_t = -0.7y_{t-1} + \varepsilon_t - 0.7\varepsilon_{t-1}$

Step 1: Calculate the autocorrelations. Use (2.34) to calculate $\rho_1$ as

$$\rho_1 = \frac{(1 + 0.49)(-0.7 - 0.7)}{1 + 0.49 + 2(0.49)} = -0.8445$$

The remaining autocorrelations decay at the rate $\rho_i = -0.7\rho_{i-1}$ so that
$\rho_2 = 0.591$, $\rho_3 = -0.414$, $\rho_4 = 0.290$, $\rho_5 = -0.203$, $\rho_6 = 0.142$, $\rho_7 = -0.099$, $\rho_8 = 0.07$

Step 2: Calculate the first two partial autocorrelations using (2.35) and (2.36). Hence

$$\phi_{11} = \rho_1 = -0.8445$$

$$\phi_{22} = [0.591 - (-0.8445)^2]/[1 - (-0.8445)^2] = -0.426$$

Step 3: Construct all remaining $\phi_{ss}$ iteratively using (2.37). To find $\phi_{33}$, note that
$\phi_{21} = \phi_{11} - \phi_{22}\phi_{11} = -1.204$ and form

$$\phi_{33} = \left(\rho_3 - \sum_{j=1}^{2}\phi_{2j}\rho_{3-j}\right)\left(1 - \sum_{j=1}^{2}\phi_{2j}\rho_j\right)^{-1}$$

$$= [-0.414 - (-1.204)(0.591) - (-0.426)(-0.8445)]/[1 - (-1.204)(-0.8445) - (-0.426)(0.591)]$$

$$= -0.262$$

Similarly, to find $\phi_{44}$ use

$$\phi_{44} = \left(\rho_4 - \sum_{j=1}^{3}\phi_{3j}\rho_{4-j}\right)\left(1 - \sum_{j=1}^{3}\phi_{3j}\rho_j\right)^{-1}$$

Since $\phi_{3j} = \phi_{2j} - \phi_{33}\phi_{2,3-j}$, it follows that $\phi_{31} = -1.315$ and $\phi_{32} = -0.74$. Hence

$$\phi_{44} = -0.173$$

If we continue in this fashion, it is possible to demonstrate that $\phi_{55} = -0.117$, $\phi_{66} = -0.081$,
$\phi_{77} = -0.056$ and $\phi_{88} = -0.039$.








The previous $p$ values can be treated as initial conditions that satisfy the
Yule-Walker equations. For these lags, the shape of the ACF is determined by the
characteristic equation.



6. THE PARTIAL AUTOCORRELATION FUNCTION

In an AR(1) process, $y_t$ and $y_{t-2}$ are correlated even though $y_{t-2}$ does not directly appear
in the model. The correlation between $y_t$ and $y_{t-2}$ (i.e., $\rho_2$) is equal to the correlation
between $y_t$ and $y_{t-1}$ (i.e., $\rho_1$) multiplied by the correlation between $y_{t-1}$ and $y_{t-2}$ (i.e., $\rho_1$
again) so that $\rho_2 = (\rho_1)^2$. It is important to note that all such indirect correlations are
present in the ACF of any autoregressive process. In contrast, the partial autocorrelation
between $y_t$ and $y_{t-s}$ eliminates the effects of the intervening values $y_{t-1}$ through
$y_{t-s+1}$. As such, in an AR(1) process the partial autocorrelation between $y_t$ and $y_{t-2}$ is
equal to zero. The most direct way to find the partial autocorrelation function is to first
form the series $\{y_t^*\}$ by subtracting the mean of the series (i.e., $\mu$) from each observation
to obtain $y_t^* = y_t - \mu$. Next, form the first-order autoregression

$$y_t^* = \phi_{11}y_{t-1}^* + e_t$$

where $e_t$ is an error term.

Here the symbol $\{e_t\}$ is used since this error process may not be white noise.

Since there are no intervening values, $\phi_{11}$ is both the autocorrelation and the partial
autocorrelation between $y_t$ and $y_{t-1}$. Now form the second-order autoregression equation

$$y_t^* = \phi_{21}y_{t-1}^* + \phi_{22}y_{t-2}^* + e_t$$

Here $\phi_{22}$ is the partial autocorrelation coefficient between $y_t$ and $y_{t-2}$. In other
words, $\phi_{22}$ is the correlation between $y_t$ and $y_{t-2}$ controlling for (i.e., "netting out")
the effect of $y_{t-1}$. Repeating this process for all additional lags $s$ yields the partial
autocorrelation function (PACF). In practice, with sample size $T$, only $T/4$ lags are
used in obtaining the sample PACF.

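Since most statistical computer packages perform these transformations, there
is little need to elaborate on the computational procedure. However, it should be
pointed out that a simple computational method relying on the so-called
Yule-Walker equations is available. One can form the partial autocorrelations from
the autocorrelations as

$$\phi_{11} = \rho_1 \tag{2.35}$$

$$\phi_{22} = (\rho_2 - \rho_1^2)/(1 - \rho_1^2) \tag{2.36}$$

and for additional lags,

$$\phi_{ss} = \frac{\rho_s - \sum_{j=1}^{s-1}\phi_{s-1,j}\,\rho_{s-j}}{1 - \sum_{j=1}^{s-1}\phi_{s-1,j}\,\rho_j}, \quad s = 3, 4, 5, \ldots \tag{2.37}$$

where $\phi_{sj} = \phi_{s-1,j} - \phi_{ss}\phi_{s-1,s-j}$, $j = 1, 2, 3, \ldots, s - 1$.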
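The recursion in (2.35) through (2.37) translates almost line for line into code. The sketch below is one possible implementation under the assumption that the autocorrelations are supplied as a list with $\rho_0 = 1$ in position 0; the name `pacf_from_acf` is hypothetical.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations [phi_11, phi_22, ..., phi_mm] from the
    autocorrelations rho = [1, rho_1, ..., rho_m] via (2.35)-(2.37)."""
    m = len(rho) - 1
    pacf = []
    prev = []   # prev[j-1] holds phi_{s-1, j} for j = 1..s-1
    for s in range(1, m + 1):
        if s == 1:
            phi_ss = rho[1]                                    # (2.35)
        else:
            num = rho[s] - sum(prev[j - 1] * rho[s - j] for j in range(1, s))
            den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, s))
            phi_ss = num / den                                 # (2.36)/(2.37)
        # update the intermediate coefficients phi_{s,j}, j = 1..s-1
        cur = [prev[j - 1] - phi_ss * prev[s - j - 1] for j in range(1, s)]
        cur.append(phi_ss)
        pacf.append(phi_ss)
        prev = cur
    return pacf

# Autocorrelations of y_t = -0.7*y_{t-1} + e_t - 0.7*e_{t-1}
rho = [1.0, (1.49 * -1.4) / 2.47]
for _ in range(7):
    rho.append(-0.7 * rho[-1])
print([round(p, 3) for p in pacf_from_acf(rho)])
```

Because the inputs here are unrounded, the later values can differ from hand calculations based on three-digit autocorrelations in the last decimal place.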






For an AR($p$) process, there is no direct correlation between $y_t$ and $y_{t-s}$ for $s > p$.
Hence, for $s > p$, all values of $\phi_{ss}$ will be zero and the PACF for a pure AR($p$) process
should cut to zero for all lags greater than $p$. This is a useful feature of the PACF that
can aid in the identification of an AR($p$) model. In contrast, consider the PACF for the
MA(1) process: $y_t = \varepsilon_t + \beta\varepsilon_{t-1}$. As long as $\beta \neq -1$, we can write $y_t/(1 + \beta L) = \varepsilon_t$,
which we know has the infinite-order autoregressive representation

$$y_t - \beta y_{t-1} + \beta^2y_{t-2} - \beta^3y_{t-3} + \cdots = \varepsilon_t$$

As such, the PACF will not jump to zero since $y_t$ will be correlated with all of its
own lags. Instead, the PACF coefficients exhibit a geometrically decaying pattern. If
$\beta < 0$, decay is direct and if $\beta > 0$, the PACF coefficients oscillate.
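The step from the lag-operator form of the MA(1) process to its infinite-order autoregression can be written out as a geometric expansion. The sketch below assumes $|\beta| < 1$ (invertibility), which is the condition under which the series converges; it is offered as a reading aid, not as the text's own derivation.

```latex
\begin{align*}
\frac{1}{1+\beta L} &= \sum_{i=0}^{\infty} (-\beta)^i L^i
  = 1 - \beta L + \beta^2 L^2 - \beta^3 L^3 + \cdots,
  \qquad |\beta| < 1, \\
\varepsilon_t &= \frac{y_t}{1+\beta L}
  = y_t - \beta y_{t-1} + \beta^2 y_{t-2} - \beta^3 y_{t-3} + \cdots
\end{align*}
```

The coefficient on $y_{t-i}$ is $(-\beta)^i$, which is why the signs alternate when $\beta > 0$ and do not when $\beta < 0$.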

Worksheet 2.2 illustrates the procedure used in constructing the PACF for the
ARMA(1, 1) model shown in the fifth panel on the right-hand side of Figure 2.2:

$$y_t = -0.7y_{t-1} + \varepsilon_t - 0.7\varepsilon_{t-1}$$

First, calculate the autocorrelations. Clearly, $\rho_0 = 1$; use equation (2.34) to calculate
$\rho_1$ as $\rho_1 = -0.8445$. Thereafter, the ACF coefficients decay at the rate $\rho_i = (-0.7)\rho_{i-1}$
for $i \ge 2$. Using (2.35) and (2.36), we obtain $\phi_{11} = -0.8445$ and $\phi_{22} = -0.4250$. All
subsequent $\phi_{ss}$ and $\phi_{sj}$ can be calculated from (2.37) as in Worksheet 2.2.

More generally, the PACF of a stationary ARMA($p$, $q$) process must ultimately
decay toward zero beginning at lag $p$. The decay pattern depends on the coefficients of
the polynomial $(1 + \beta_1L + \beta_2L^2 + \cdots + \beta_qL^q)$. Table 2.1 summarizes some of the properties
of the ACF and PACF for various ARMA processes. Also, the right-hand side graphs
of Figure 2.2 show the partial autocorrelation functions of the five indicated processes.

Table 2.1 Properties of the ACF and PACF

| Process | ACF | PACF |
|---|---|---|
| White noise | All $\rho_s = 0$ ($s \neq 0$) | All $\phi_{ss} = 0$ |
| AR(1): $a_1 > 0$ | Direct exponential decay: $\rho_s = a_1^s$ | $\phi_{11} = \rho_1$; $\phi_{ss} = 0$ for $s \ge 2$ |
| AR(1): $a_1 < 0$ | Oscillating decay: $\rho_s = a_1^s$ | $\phi_{11} = \rho_1$; $\phi_{ss} = 0$ for $s \ge 2$ |
| AR($p$) | Decays toward zero. Coefficients may oscillate. | Spikes through lag $p$. All $\phi_{ss} = 0$ for $s > p$. |
| MA(1): $\beta > 0$ | Positive spike at lag 1. $\rho_s = 0$ for $s \ge 2$ | Oscillating decay: $\phi_{11} > 0$. |
| MA(1): $\beta < 0$ | Negative spike at lag 1. $\rho_s = 0$ for $s \ge 2$ | Geometric decay: $\phi_{11} < 0$. |
| ARMA(1, 1): $a_1 > 0$ | Exponential decay beginning at lag 1. Sign $\rho_1$ = sign($a_1 + \beta$) | Oscillating decay beginning at lag 1. $\phi_{11} = \rho_1$ |
| ARMA(1, 1): $a_1 < 0$ | Oscillating decay beginning at lag 1. Sign $\rho_1$ = sign($a_1 + \beta$) | Exponential decay beginning at lag 1. $\phi_{11} = \rho_1$ and sign($\phi_{ss}$) = sign($\phi_{11}$). |
| ARMA($p$, $q$) | Decay (either direct or oscillatory) beginning at lag $q$. | Decay (either direct or oscillatory) after lag $p$. |

For stationary processes, the key points to note are the following:

1. The ACF of an ARMA($p$, $q$) process will begin to decay after lag $q$. After
lag $q$, the coefficients of the ACF (i.e., the $\rho_i$) will satisfy the difference
equation ($\rho_i = a_1\rho_{i-1} + a_2\rho_{i-2} + \cdots + a_p\rho_{i-p}$). Since the characteristic roots
are inside the unit circle, the autocorrelations will decay after lag $q$.
Moreover, the pattern of the autocorrelation coefficients will mimic that
suggested by the characteristic roots.

2. The PACF of an ARMA($p$, $q$) process will begin to decay after lag $p$. After
lag $p$, the coefficients of the PACF (i.e., the $\phi_{ss}$) will mimic the ACF coefficients
from the model $y_t/(1 + \beta_1L + \beta_2L^2 + \cdots + \beta_qL^q)$.

We can illustrate the usefulness of the ACF and PACF functions using the model
$y_t = a_0 + 0.7y_{t-1} + \varepsilon_t$. If we compare the top two graphs in Figure 2.2, the ACF shows
the monotonic decay of the autocorrelations while the PACF exhibits the single spike
at lag 1. Suppose that a researcher collected sample data and plotted the ACF and
PACF functions. If the actual patterns compared favorably to the theoretical patterns,
the researcher might try to estimate data using an AR(1) model. Correspondingly, if
the ACF exhibited a single spike and the PACF exhibited monotonic decay (see the
third graph for the model $y_t = \varepsilon_t - 0.7\varepsilon_{t-1}$), the researcher might try an MA(1) model.



7. SAMPLE AUTOCORRELATIONS OF STATIONARY SERIES

In practice, the theoretical mean, variance, and autocorrelations of a series are
unknown to the researcher. Given that a series is stationary, we can use the sample
mean, variance and autocorrelations to estimate the parameters of the actual data
generating process. Let there be $T$ observations labeled $y_1$ through $y_T$. We can let $\bar{y}$,
$\hat{\sigma}^2$, and $r_s$ be estimates of $\mu$, $\sigma^2$, and $\rho_s$ respectively where¹

$$\bar{y} = (1/T)\sum_{t=1}^{T} y_t \tag{2.38}$$

$$\hat{\sigma}^2 = (1/T)\sum_{t=1}^{T} (y_t - \bar{y})^2 \tag{2.39}$$

and for each value of $s = 1, 2, \ldots,$

$$r_s = \frac{\sum_{t=s+1}^{T} (y_t - \bar{y})(y_{t-s} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2} \tag{2.40}$$
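The three sample quantities in (2.38) through (2.40) can be computed directly from a list of observations. The sketch below uses only the standard library; the function names and the five-point series are hypothetical and serve only to make the arithmetic easy to follow.

```python
def sample_mean(y):
    """Equation (2.38): the arithmetic mean of the T observations."""
    return sum(y) / len(y)

def sample_variance(y):
    """Equation (2.39): divides by T, not T - 1."""
    ybar = sample_mean(y)
    return sum((v - ybar) ** 2 for v in y) / len(y)

def sample_acf(y, max_lag):
    """Equation (2.40): returns [r_1, ..., r_max_lag]."""
    T = len(y)
    ybar = sample_mean(y)
    dev = [v - ybar for v in y]
    denom = sum(d * d for d in dev)
    acf = []
    for s in range(1, max_lag + 1):
        # the numerator runs over t = s+1..T (0-based indices s..T-1)
        num = sum(dev[t] * dev[t - s] for t in range(s, T))
        acf.append(num / denom)
    return acf

y = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_mean(y), sample_variance(y), sample_acf(y, 2))
```

Note that the numerator of $r_s$ uses $T - s$ products while the denominator uses all $T$ squared deviations, so sample autocorrelations at long lags are pulled toward zero.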

The sample autocorrelation function [i.e., the ACF derived from (2.40)] and the
sample PACF can be compared to various theoretical functions to help identify the
actual nature of the data generating process. Box and Jenkins (1976) discuss the
distribution of the sample values of $r_s$ under the null that $y_t$ is stationary with normally
distributed errors. Allowing var($r_s$) to denote the sampling variance of $r_s$, they obtain

$$\operatorname{var}(r_s) = \begin{cases} T^{-1} & \text{for } s = 1 \\ T^{-1}\left(1 + 2\sum_{j=1}^{s-1} r_j^2\right) & \text{for } s > 1 \end{cases} \tag{2.41}$$

if the true value of $r_s = 0$ [i.e., if the true data-generating process is an MA($s-1$)
process]. Moreover, in large samples (i.e., for large values of $T$), $r_s$ will be normally
distributed with a mean equal to zero. For the PACF coefficients, under the null
hypothesis of an AR($p$) model (i.e., under the null that all $\phi_{p+i,p+i}$ are zero), the
variance of the $\phi_{p+i,p+i}$ is approximately $1/T$.

In practice, we can use these sample values to form the sample autocorrelation
and partial autocorrelation functions and test for significance using (2.41). For example,
if we use a 95 percent confidence interval (i.e., two standard deviations), and the
calculated value of $r_1$ exceeds $2T^{-1/2}$, it is possible to reject the null hypothesis that the
first-order autocorrelation is not statistically different from zero. Rejecting this
hypothesis means rejecting an MA($s - 1$) = MA(0) process and accepting the alternative
$q > 0$. Next, try $s = 2$; var($r_2$) is $(1 + 2r_1^2)/T$. If $r_1$ is 0.5 and $T$ is 100, the variance
of $r_2$ is 0.015 and the standard deviation is about 0.123. Thus, if the calculated value
of $r_2$ exceeds 2(0.123), it is possible to reject the hypothesis $r_2 = 0$. Here, rejecting the
null means accepting the alternative that $q > 1$. Repeating for the various values of $s$
is helpful in identifying the order to the process. The maximum number of sample
autocorrelations and partial autocorrelations to use is typically set equal to $T/4$.
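The two-standard-deviation rule based on (2.41) can be sketched as follows. The function names are hypothetical, and the list `r` is assumed to hold the sample autocorrelations $r_1, r_2, \ldots$ in order.

```python
import math

def acf_variance(r, s, T):
    """Equation (2.41): sampling variance of r_s under the null of an
    MA(s-1) process. r = [r_1, r_2, ...], T = number of observations."""
    if s == 1:
        return 1.0 / T
    return (1.0 + 2.0 * sum(r_j ** 2 for r_j in r[: s - 1])) / T

def is_significant(r, s, T, n_sd=2.0):
    """True when |r_s| exceeds n_sd standard deviations."""
    return abs(r[s - 1]) > n_sd * math.sqrt(acf_variance(r, s, T))

# The worked numbers from the text: r_1 = 0.5 and T = 100
var_r2 = acf_variance([0.5], 2, 100)
print(var_r2, math.sqrt(var_r2))   # 0.015 and roughly 0.122
```

A calculated $r_2$ would therefore need to exceed about 0.245 in absolute value before the hypothesis $r_2 = 0$ is rejected at this rough 95 percent level.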

Within any large group of autocorrelations, some will exceed two standard deviations
as a result of pure chance even though the true values in the data-generating
process are zero. The $Q$-statistic can be used to test whether a group of autocorrelations
is significantly different from zero. Box and Pierce (1970) used the sample
autocorrelations to form the statistic

$$Q = T\sum_{k=1}^{s} r_k^2$$

Under the null hypothesis that all values of $r_k = 0$, $Q$ is asymptotically $\chi^2$ distributed
with $s$ degrees of freedom. The intuition behind the use of the statistic is that
high sample autocorrelations lead to large values of $Q$. Certainly, a white-noise
process (in which all autocorrelations should be zero) would have a $Q$ value of zero.
If the calculated value of $Q$ exceeds the appropriate value in a $\chi^2$ table, we can reject
the null of no significant autocorrelations. Note that rejecting the null means accepting
an alternative that at least one autocorrelation is not zero.

A problem with the Box-Pierce $Q$-statistic is that it works poorly even in moderately
large samples. Ljung and Box (1978) report superior small sample performance
for the modified $Q$-statistic calculated as

$$Q = T(T + 2)\sum_{k=1}^{s} r_k^2/(T - k) \tag{2.42}$$

If the sample value of $Q$ calculated from (2.42) exceeds the critical value of $\chi^2$ with
$s$ degrees of freedom, then at least one value of $r_k$ is statistically different from zero at the
specified significance level. The Box-Pierce and Ljung-Box $Q$-statistics also serve as a
check to see if the residuals from an estimated ARMA($p$, $q$) model behave as a white-noise
process. However, when the $s$ correlations from an estimated ARMA($p$, $q$) model
are formed, the degrees of freedom are reduced by the number of estimated coefficients.
Hence, using the residuals of an ARMA($p$, $q$) model, $Q$ has a $\chi^2$ with $s - p - q$ degrees
of freedom (if a constant is included, the degrees of freedom are $s - p - q - 1$).
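Both $Q$-statistics are simple weighted sums of squared sample autocorrelations. The sketch below computes them and the adjusted degrees of freedom; comparing the result with a $\chi^2$ critical value is left to a table or a statistics package. The function names and the two illustrative autocorrelations are hypothetical.

```python
def box_pierce_q(r, T):
    """Box-Pierce statistic: T times the sum of squared r_k.
    r = [r_1, ..., r_s], T = number of observations."""
    return T * sum(rk ** 2 for rk in r)

def ljung_box_q(r, T):
    """Ljung-Box statistic, equation (2.42)."""
    return T * (T + 2) * sum(rk ** 2 / (T - k) for k, rk in enumerate(r, start=1))

def q_degrees_of_freedom(s, p=0, q=0, constant=False):
    """Degrees of freedom when Q is formed from ARMA(p, q) residuals."""
    return s - p - q - (1 if constant else 0)

r = [0.5, 0.3]                       # illustrative values only
print(box_pierce_q(r, 100))          # 100 * (0.25 + 0.09)
print(ljung_box_q(r, 100))           # slightly larger than Box-Pierce
print(q_degrees_of_freedom(8, p=1, q=1, constant=True))
```

Because each term is scaled by $(T + 2)/(T - k) > 1$, the Ljung-Box value is never smaller than the Box-Pierce value for the same autocorrelations.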

Model Selection Criteria 

One natural question to ask of any estimated model is: How well does it fit the data?
Adding additional lags for $p$ and/or $q$ will necessarily reduce the sum of squares of
the estimated residuals. However, adding such lags entails the estimation of additional
coefficients and an associated loss of degrees of freedom. Moreover, the inclusion
of extraneous coefficients will reduce the forecasting performance of the fitted
model. As discussed in some detail in Appendix 2.2 of this chapter, there exist various
model selection criteria that trade off a reduction in the sum of squares of the
residuals for a more parsimonious model. The two most commonly used model
selection criteria are the Akaike Information Criterion (AIC) and the Schwarz
Bayesian Criterion (SBC). Although there are several different ways to report the criteria
(as illustrated by question 10 at the end of this chapter), all will select the same
model. In the text, we will use the following formulas

AIC = $T \ln$(sum of squared residuals) $+ 2n$
SBC = $T \ln$(sum of squared residuals) $+ n \ln(T)$

where: $n$ = number of parameters estimated ($p + q$ + possible constant term)

$T$ = number of usable observations

When you estimate a model using lagged variables, some observations are lost.
To adequately compare the alternative models, $T$ should be kept fixed. Otherwise, you
will be comparing the performance of the models over different sample periods.
Moreover, decreasing $T$ has the direct effect of reducing the AIC and the SBC; the goal
is not to select a model because it has the smallest number of usable observations. For
example, with 100 data points, estimate an AR(1) and an AR(2) using only the last 98
observations in each estimation. Compare the two models using $T = 98$.

Ideally, the AIC and SBC will be as small as possible (note that both can be negative).
As the fit of the model improves, the AIC and SBC will approach $-\infty$. We can
use these criteria to aid in selecting the most appropriate model; model A is said to fit
better than model B if the AIC (or SBC) for A is smaller than for model B. In using
the criteria to compare alternative models, we must estimate them over the same sample
period so that they will be comparable. For each, increasing the number of regressors
increases $n$ but should have the effect of reducing the sum of squared residuals.
Thus, if a regressor has no explanatory power, adding it to the model will cause both
the AIC and SBC to increase. Since $\ln(T)$ will be greater than 2 (which holds for any
$T \ge 8$), the SBC will always select a more parsimonious model than will the AIC; the
marginal cost of adding regressors is greater with the SBC than with the AIC.
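The two criteria are easy to compute once the sum of squared residuals is known. The sketch below uses the formulas given in the text; the function names and the residual sums of squares in the example are hypothetical numbers chosen only to show the trade-off, not values from any table in this chapter.

```python
import math

def aic(ssr, T, n):
    """AIC = T*ln(SSR) + 2n, with n estimated parameters and T usable observations."""
    return T * math.log(ssr) + 2 * n

def sbc(ssr, T, n):
    """SBC = T*ln(SSR) + n*ln(T)."""
    return T * math.log(ssr) + n * math.log(T)

# Hypothetical comparison over the same T = 98 usable observations:
# an extra parameter lowers the SSR only slightly, from 90.0 to 89.5.
T = 98
print(aic(90.0, T, 1), aic(89.5, T, 2))
print(sbc(90.0, T, 1), sbc(89.5, T, 2))
```

In this illustration the small drop in the sum of squared residuals does not offset the penalty for the second parameter, so both criteria favor the smaller model; the SBC penalty per parameter, $\ln(98) \approx 4.6$, is more than twice the AIC penalty of 2.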

Of the two criteria, the SBC has superior large sample properties. Let the true order
of the data generating process be ($p^*$, $q^*$) and suppose that we use the AIC and SBC to
estimate all ARMA models of order ($p$, $q$) where $p \ge p^*$ and $q \ge q^*$. Both the AIC and
the SBC will select models of orders greater than or equal to ($p^*$, $q^*$) as the sample size
approaches infinity. However, the SBC is asymptotically consistent while the AIC is
biased toward selecting an overparameterized model. However, in small samples the
AIC can work better than the SBC. You can be quite confident in your results if both
the AIC and the SBC select the same model. If they select different models, you need
to proceed cautiously. Since SBC selects the more parsimonious model, you should
check to determine if the residuals appear to be white noise. Since the AIC can select
an overparameterized model, the $t$-statistics of all coefficients should be significant at
conventional levels. A number of other diagnostic checks that can be used to compare
alternative models are presented in Sections 8 and 9. Nevertheless, it is wise to retain a
healthy skepticism of your estimated models. With many data sets, it is just not possible
to find the one model that clearly dominates all others. There is nothing wrong with
reporting the results and the forecasts using alternative estimations.

Estimation of an AR(1) Model 

Let us use a specific example to see how the sample autocorrelation function and partial
autocorrelation function can be used as an aid in identifying an ARMA model. A computer
program was used to draw 100 normally distributed random numbers with a theoretical
variance equal to unity. Call these random variates $\varepsilon_t$, where $t$ runs from 1 to 100.
Beginning with $t = 1$, values of $y_t$ were generated using the formula $y_t = 0.7y_{t-1} + \varepsilon_t$ and
the initial condition $y_0 = 0$. Note that the problem of nonstationarity is avoided since the
initial condition is consistent with long-run equilibrium. The upper-left-hand graph of
Figure 2.3 shows the sample correlogram and the upper-right-hand graph shows the sample
PACF. You should take a minute to compare the ACF and PACF to those of the theoretical
processes shown in Figure 2.2.
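A series of this kind can be generated in a few lines. The sketch below is not the program used for the book's data, so its draws (and therefore its sample ACF and PACF) will differ from those in Figure 2.3; the function name and the seed are arbitrary choices.

```python
import random

def simulate_ar1(a1, T, seed=0, y0=0.0):
    """Generate y_1..y_T from y_t = a1*y_{t-1} + e_t with e_t ~ N(0, 1)
    and initial condition y_0 = y0."""
    rng = random.Random(seed)      # seeded so the draws are reproducible
    series, prev = [], y0
    for _ in range(T):
        prev = a1 * prev + rng.gauss(0.0, 1.0)
        series.append(prev)
    return series

y = simulate_ar1(0.7, 100, seed=42)
print(len(y), y[:3])
```

Starting from $y_0 = 0$ matches the unconditional mean of the process, which is the sense in which the initial condition is consistent with long-run equilibrium.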

In practice, we never know the true data-generating process. However, suppose
we were presented with these 100 sample values and were asked to uncover the true
process. The first step might be to compare the sample ACF and PACF to those of the
various theoretical models. The decaying pattern of the ACF and the single large spike
at lag 1 in the sample PACF suggests an AR(1) model. The first three autocorrelations
are $r_1 = 0.74$, $r_2 = 0.58$, and $r_3 = 0.47$ (which are somewhat greater than the theoretical
values of 0.7, 0.49 ($0.7^2 = 0.49$), and 0.343). In the PACF, there is a sizable spike
of 0.74 at lag 1, and all other partial autocorrelations (except for lag 12) are very small.

Under the null hypothesis of an MA(0) process, the standard deviation of $r_1$ is
$T^{-1/2} = 0.1$. Since the sample value of $r_1 = 0.74$ is more than seven standard deviations
from zero, we can reject the null hypothesis that $r_1$ equals 0. The standard deviation
of $r_2$ is obtained by applying (2.41) to the sampling data, where $s = 2$:

$$\operatorname{var}(r_2) = (1 + 2(0.74)^2)/100 = 0.021$$

Since $(0.021)^{1/2} = 0.1449$, the sample value of $r_2$ is more than three standard
deviations from zero; at conventional significance levels, we can reject the null
hypothesis that $r_2$ equals zero. We can similarly test the significance of the other values
of the autocorrelations.

As you can see in the second panel of the figure, other than $\phi_{11}$, all partial
autocorrelations (except for lag 12) are less than $2T^{-1/2} = 0.2$. The decay of the
ACF and the single spike of the PACF give the strong impression of a first-order
autoregressive model. Nevertheless, if we did not know the true underlying
process, and happened to be using monthly data, we might be concerned with the
significant partial autocorrelation at lag 12. After all, with monthly data we might
expect some direct relationship between $y_t$ and $y_{t-12}$.

FIGURE 2.3 ACF and PACF for Two Simulated Processes

Although we know that the data was actually generated from an AR(1) process,
it is illuminating to compare the estimates of two different models. Suppose we estimate
an AR(1) model and also try to capture the spike at lag 12 with an MA coefficient.
Thus, we can consider the two tentative models

Model 1: $y_t = a_1y_{t-1} + \varepsilon_t$

Model 2: $y_t = a_1y_{t-1} + \varepsilon_t + \beta_{12}\varepsilon_{t-12}$

Table 2.2 reports the results of the two estimations. The coefficient of model 1
satisfies the stability condition $|a_1| < 1$ and has a low standard error (the associated
$t$-statistic for a null of zero is more than 12). As a useful diagnostic check, we plot the
correlogram of the residuals of the fitted model in Figure 2.4. The $Q$-statistics for
these residuals indicate that each one of the autocorrelations is less than 2 standard
deviations from zero. The Ljung-Box $Q$-statistics of these residuals indicate that as
a group, lags 1 through 8, 1 through 16, and 1 through 24 are not significantly different
from zero. This is strong evidence that the AR(1) model "fits" the data well. After
all, if residual autocorrelations were significant, the AR(1) model would not utilize
all available information concerning movements in the $\{y_t\}$ sequence. For example,
suppose we wanted to forecast $y_{t+1}$ conditioned on all available information up to and
including period $t$. With model 1, the value of $y_{t+1}$ is $y_{t+1} = a_1y_t + \varepsilon_{t+1}$. Hence, the
forecast from model 1 is $a_1y_t$. If the residual autocorrelations had been significant,
this forecast would not capture all of the available information set.

Table 2.2 Estimates of an AR(1) Model

| | Model 1: $y_t = a_1y_{t-1} + \varepsilon_t$ | Model 2: $y_t = a_1y_{t-1} + \varepsilon_t + \beta_{12}\varepsilon_{t-12}$ |
|---|---|---|
| Degrees of freedom | 99 | 98 |
| Sum of squared residuals | 85.21 | 85.17 |
| Estimated $a_1$ (standard error) | 0.7910 (0.0622) | 0.7953 (0.0638) |
| Estimated $\beta$ (standard error) | | -0.033 (0.1134) |
| AIC; SBC | AIC = 441.9; SBC = 444.5 | AIC = 443.9; SBC = 449.1 |
| Ljung-Box Q-statistics for the residuals (significance level in parentheses) | Q(8) = 6.43 (0.490); Q(16) = 15.86 (0.391); Q(24) = 21.74 (0.536) | Q(8) = 6.48 (0.485); Q(16) = 15.75 (0.400); Q(24) = 21.56 (0.547) |

FIGURE 2.4 ACF of Residuals from Model 1
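To make the estimation and forecasting steps concrete, the sketch below fits model 1 by ordinary least squares without an intercept and forms the one-step-ahead forecast $a_1y_t$. OLS is an assumption made here for simplicity; it is not necessarily the estimator that produced Table 2.2. The function names and the short data series are hypothetical.

```python
import math

def fit_ar1(y):
    """OLS fit of y_t = a1*y_{t-1} + e_t (no intercept).
    Returns (a1_hat, standard_error, sum_of_squared_residuals)."""
    x, z = y[:-1], y[1:]                       # regressor y_{t-1}, regressand y_t
    sxx = sum(v * v for v in x)
    a1 = sum(xi * zi for xi, zi in zip(x, z)) / sxx
    ssr = sum((zi - a1 * xi) ** 2 for xi, zi in zip(x, z))
    dof = len(z) - 1                           # usable observations minus 1 parameter
    se = math.sqrt(ssr / dof / sxx)
    return a1, se, ssr

def forecast_one_step(a1, y_last):
    """Conditional forecast of y_{t+1} given y_t under model 1."""
    return a1 * y_last

y = [1.0, 0.8, 0.3, 0.4, 0.1]                  # toy data, for illustration only
a1_hat, se, ssr = fit_ar1(y)
print(a1_hat, se, a1_hat / se, ssr)
print(forecast_one_step(a1_hat, y[-1]))
```

The ratio of the estimate to its standard error is the $t$-statistic for the null $a_1 = 0$, and the residuals from the fit are what the correlogram and the Ljung-Box statistics are applied to.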

Examining the results for model 2, note that both models yield similar estimates
for the first-order autoregressive coefficient and the associated standard error.
However, the estimate for $\beta_{12}$ is of poor quality; the insignificant $t$ value suggests
that it should be dropped from the model. Moreover, comparing the AIC and the
SBC values of the two models suggests that any benefits of a reduced sum of
squared residuals are overwhelmed by the detrimental effects of estimating an additional
parameter. All of these indicators point to the choice of model 1.

Exercise 7 at the end of this chapter entails various estimations using this data
set. In this exercise you are asked to show that the AR(1) model performs better than
some alternative specifications. It is important that you complete this exercise.




Estimation of an ARMA(1, 1) Model 

A second $\{y_t\}$ sequence was constructed to illustrate the estimation of an
ARMA(1, 1). Given 100 normally distributed values of $\{\varepsilon_t\}$, 100 values of $\{y_t\}$
were generated using

$$y_t = -0.7y_{t-1} + \varepsilon_t - 0.7\varepsilon_{t-1}$$

where $y_0$ and $\varepsilon_0$ were both set equal to zero.
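The ARMA(1, 1) recursion needs one extra piece of state, the previous shock. The sketch below is a minimal illustration with an arbitrary seed and a hypothetical function name; it will not reproduce the particular draws behind Figure 2.3.

```python
import random

def simulate_arma11(a1, b1, T, seed=0):
    """Generate y_1..y_T from y_t = a1*y_{t-1} + e_t + b1*e_{t-1},
    with e_t ~ N(0, 1) and initial conditions y_0 = e_0 = 0."""
    rng = random.Random(seed)
    y_prev, e_prev = 0.0, 0.0
    series = []
    for _ in range(T):
        e = rng.gauss(0.0, 1.0)
        y_t = a1 * y_prev + e + b1 * e_prev
        series.append(y_t)
        y_prev, e_prev = y_t, e     # carry both states forward
    return series

y = simulate_arma11(-0.7, -0.7, 100, seed=7)
print(len(y), y[:3])
```

Feeding such a series to a sample ACF and PACF routine gives correlograms that can be compared with the theoretical patterns for this process.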

Both the sample ACF and the PACF from the simulated data (see the second set
of graphs in Figure 2.3) are roughly equivalent to those of the theoretical model
shown in Figure 2.2. However, if the true data-generating process were unknown, the
researcher might be concerned about certain discrepancies. An AR(2) model could
yield a sample ACF and PACF similar to those in the figure. Table 2.3 reports the
results of estimating the data using the following three models:

Model 1: $y_t = a_1y_{t-1} + \varepsilon_t$

Model 2: $y_t = a_1y_{t-1} + \varepsilon_t + \beta_1\varepsilon_{t-1}$

Model 3: $y_t = a_1y_{t-1} + a_2y_{t-2} + \varepsilon_t$

In examining Table 2.3, notice that all of the estimated values of $a_1$ are highly
significant; each of the estimated values is at least eight standard deviations from
zero. It is clear that the AR(1) model is inappropriate. The $Q$-statistics for model 1
indicate that there is significant autocorrelation in the residuals. The estimated
ARMA(1, 1) model does not suffer from this problem. Moreover, both the AIC and
the SBC select model 2 over model 1.

The same type of reasoning indicates that model 2 is preferred to model 3. Note
that for each model, the estimated coefficients are highly significant and the point
estimates imply convergence. Although the $Q$-statistic at 24 lags indicates that these
two models do not suffer from correlated residuals, the $Q$-statistic at 8 lags indicates
serial correlation in the residuals of model 3. Thus, the AR(2) model does not capture
short-term dynamics as well as the ARMA(1, 1) model. Also note that the AIC and
SBC both select model 2.

Table 2.3 Estimates of an ARMA(1, 1) Model

| | Estimates¹ | Q-Statistics² | AIC/SBC³ |
|---|---|---|---|
| Model 1 | $a_1$: -0.835 (.053) | Q(8) = 26.19 (.000); Q(24) = 41.10 (.001) | AIC = 496.5; SBC = 499.0 |
| Model 2 | $a_1$: -0.679 (.076); $\beta_1$: -0.676 (.081) | Q(8) = 3.86 (.695); Q(24) = 14.23 (.892) | AIC = 471.0; SBC = 476.2 |
| Model 3 | $a_1$: -1.16 (.093); $a_2$: -0.378 (.092) | Q(8) = 11.44 (.057); Q(24) = 22.59 (.424) | AIC = 482.8; SBC = 487.9 |

Notes:

¹ Standard errors in parentheses.

² Ljung-Box Q-statistics of the residuals from the fitted model. The
significance levels are in parentheses.

³ For comparability, the AIC and SBC values are reported for estimations
that used only observations 3 through 100. If the AR(1) is estimated using
99 observations, the AIC and SBC are 502.3 and 504.9, respectively. If the
ARMA(1, 1) is estimated using 99 observations, the AIC and SBC are
476.6 and 481.1, respectively.

Estimation of an AR(2) Model

A third data series was simulated as

$$y_t = 0.7y_{t-1} - 0.49y_{t-2} + \varepsilon_t$$

The estimated ACF and the PACF of the series follow:

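ACF

| Lags | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-12: | 0.466 | -0.161 | -0.322 | -0.108 | -0.052 | -0.165 | -0.010 | 0.128 | 0.180 | 0.034 | -0.087 | -0.113 |
| 13-24: | -0.164 | -0.058 | 0.115 | 0.254 | 0.046 | -0.175 | 0.150 | 0.010 | 0.032 | -0.089 | -0.046 | 0.052 |

PACF

| Lags | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-12: | 0.466 | -0.482 | 0.023 | 0.045 | -0.253 | -0.121 | 0.101 | 0.037 | -0.076 | 0.023 | -0.020 | -0.139 |
| 13-24: | -0.167 | 0.207 | 0.007 | 0.085 | -0.216 | 0.013 | -0.022 | -0.032 | 0.015 | -0.061 | 0.038 | -0.184 |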



Note the large autocorrelation at lag 16 and the large partial autocorrelations at
lags 14 and 17. Given the way the process was simulated, the presence of these
autocorrelations is due to nothing more than chance. However, an econometrician
unaware of the actual data-generating process might be concerned about these
autocorrelations. The coefficients of the AR(2) model are estimated as

| Coefficient | Estimate | Standard Error | t-Statistic | Significance Level |
|---|---|---|---|---|
| $a_1$ | 0.692389807 | 0.089515769 | 7.73484 | 0.00000000 |
| $a_2$ | -0.480874620 | 0.089576524 | -5.36831 | 0.00000055 |

AIC = 219.87333 SBC = 225.04327




Overall, the model appears to be adequate. However, the two AR(2) coefficients
are unable to capture the correlations at very long lags. For example, the partial
autocorrelations of the residuals for lags 14 and 17 are both greater than 0.2 in absolute
value. The calculated Ljung-Box statistic for 16 lags is 24.6248 (which is significant
at the 0.038 level). At this point, it might be tempting to try to model the correlation at
lag 16 by including the moving average term $\beta_{16}\varepsilon_{t-16}$. Such an estimation results in²

| Coefficient | Estimate | Standard Error | t-Statistic | Significance Level |
|---|---|---|---|---|
| $a_1$ | 0.716681247 | 0.091069451 | 7.86961 | 0.00000000 |
| $a_2$ | -0.464999924 | 0.090958095 | -5.11224 | 0.00000165 |
| $\beta_{16}$ | 0.305813568 | 0.109936945 | 2.78172 | 0.00652182 |

AIC = 213.40055 SBC = 221.15545









All estimated coefficients are significant and the Ljung-Box $Q$-statistics for the
residuals are all insignificant at conventional levels. In conjunction with the fact that
the AIC and SBC both select this second model, the researcher unaware of the true
process might be tempted to conclude that the data-generating process includes a
moving average term at lag 16.

A useful model check is to split the sample into two parts. If a coefficient is present
in the data-generating process, its influence should be seen in both subsamples. If
the simulated series is split into two parts, the ACF and PACF using observations 50
through 100 follow:



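ACF

| Lags | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-12: | 0.460 | -0.207 | -0.280 | 0.035 | 0.095 | -0.153 | -0.133 | 0.102 | 0.181 | 0.027 | -0.009 | 0.009 |
| 13-24: | -0.058 | -0.088 | 0.042 | 0.205 | 0.064 | -0.162 | -0.184 | -0.050 | -0.033 | -0.145 | -0.107 | -0.036 |

PACF

| Lags | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1-12: | 0.460 | -0.531 | 0.193 | 0.063 | -0.197 | -0.130 | 0.234 | -0.078 | 0.004 | 0.065 | 0.154 | -0.257 |
| 13-24: | 0.026 | 0.146 | 0.038 | 0.004 | -0.054 | -0.009 | -0.137 | -0.076 | -0.034 | -0.083 | -0.029 | -0.220 |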



As you can see, the size of the partial autocorrelations at lags 14 and 17 is
diminished. Now, estimating a pure AR(2) model over this second part of the sample
yields

| Coefficient | Estimate | Standard Error | t-Statistic | Significance Level |
|---|---|---|---|---|
| $a_1$ | 0.713855785 | 0.120541523 | 5.92207 | 0.00000031 |
| $a_2$ | -0.537843744 | 0.120420318 | -4.46639 | 0.00004687 |

Q(8) = 7.83, significance level 0.251
Q(16) = 15.93, significance level 0.317
Q(24) = 26.06, significance level 0.249

All estimated coefficients are significant, and the Ljung-Box $Q$-statistics do not
indicate any significant autocorrelations in the residuals. In fact, this model does capture
the actual data-generating process quite well. In this example, the large spurious
autocorrelations of the long lags can be eliminated by changing the sample period.
Thus, it is hard to maintain that the correlation at lag 16 is meaningful. Most sophisticated
practitioners warn against trying to fit any model to the very long lags. As you
can infer from (2.41), the variance of $r_s$ can be sizable when $s$ is large. Moreover, in
small samples, a few "unusual" observations can create the appearance of significant
autocorrelations at long lags. Since econometric estimation involves unknown
data-generating processes, the more general point is that we always need to be wary of our
estimated model. Fortunately, Box and Jenkins (1976) established a set of procedures
that can be used to check a model's adequacy.






76 CHAPTER 2 STATIONARY TIME-SERIES MODELS




8. BOX-JENKINS MODEL SELECTION

The estimates of the AR(1), ARMA(1, 1), and AR(2) models in the previous section
illustrate the Box-Jenkins (1976) strategy for appropriate model selection. Box and
Jenkins popularized a three-stage method aimed at selecting an appropriate model for
the purpose of estimating and forecasting a univariate time series. In the identification
stage, the researcher visually examines the time plot of the series, the autocorrelation
function, and the partial autocorrelation function. Plotting the time path of the
$\{y_t\}$ sequence provides useful information concerning outliers, missing values, and
structural breaks in the data. Nonstationary variables may have a pronounced trend or
appear to meander without a constant long-run mean or variance. Missing values and
outliers can be corrected at this point. At one time, the standard practice was to
first-difference any series deemed to be nonstationary. Currently, a large literature is
evolving that develops formal procedures to check for nonstationarity. We defer this
discussion until Chapter 4 and assume that we are working with stationary data. A
comparison of the sample ACF and PACF to those of various theoretical ARMA
processes may suggest several plausible models. In the estimation stage, each of the
tentative models is fit and the various $a_i$ and $\beta_i$ coefficients are examined. In this
second stage, the estimated models are compared using the following criteria.

Parsimony 

A fundamental idea in the Box-Jenkins approach is the principle of parsimony.
Parsimony (meaning sparseness or stinginess) should come as second nature to
economists. Incorporating additional coefficients will necessarily increase fit (e.g., the
value of $R^2$ will increase) at a cost of reducing degrees of freedom. Box and Jenkins
argue that parsimonious models produce better forecasts than overparameterized
models. A parsimonious model fits the data well without incorporating any needless
coefficients. The aim is to approximate the true data-generating process but not to pin
down the exact process. The goal of parsimony suggested eliminating the MA(12)
coefficient in the simulated AR(1) model shown earlier.

In selecting an appropriate model, the econometrician needs to be aware that
several different models may have similar properties. As an extreme example, note that
the AR(1) model $y_t = 0.5y_{t-1} + \varepsilon_t$ has the equivalent infinite-order moving-average
representation of $y_t = \varepsilon_t + 0.5\varepsilon_{t-1} + 0.25\varepsilon_{t-2} + 0.125\varepsilon_{t-3} + 0.0625\varepsilon_{t-4} + \dots$. In most
samples, approximating this MA($\infty$) process with an MA(2) or MA(3) model will
give a very good fit. However, the AR(1) model is the more parsimonious model and
is preferred. As a test, you should show that this AR(1) model has the equivalent
representation of $y_t = 0.25y_{t-2} + 0.5\varepsilon_{t-1} + \varepsilon_t$.
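
The equivalence claimed in the exercise can be checked numerically. The following is a minimal sketch, not part of the original text: it simulates the AR(1) model with $a_1 = 0.5$ from an arbitrary seed and starting value (both assumptions), lists the first MA($\infty$) weights, and confirms that every simulated observation also satisfies $y_t = 0.25y_{t-2} + 0.5\varepsilon_{t-1} + \varepsilon_t$. The function names are hypothetical.

```python
import random


def simulate_ar1(a1, shocks, y0=0.0):
    """Generate y_t = a1*y_{t-1} + e_t, starting from the assumed value y0."""
    y, prev = [], y0
    for e in shocks:
        prev = a1 * prev + e
        y.append(prev)
    return y


def ma_weights(a1, n):
    """First n weights of the MA(infinity) form of an AR(1): a1**i on e_{t-i}."""
    return [a1 ** i for i in range(n)]


rng = random.Random(0)  # arbitrary seed, for reproducibility only
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = simulate_ar1(0.5, shocks)

# Largest discrepancy between y_t and 0.25*y_{t-2} + 0.5*e_{t-1} + e_t
max_gap = max(
    abs(y[t] - (0.25 * y[t - 2] + 0.5 * shocks[t - 1] + shocks[t]))
    for t in range(2, len(y))
)

print(ma_weights(0.5, 5))
print(max_gap)
```

The gap is zero up to floating-point rounding because the second form is obtained by substituting $y_{t-1} = 0.5y_{t-2} + \varepsilon_{t-1}$ into the AR(1) equation.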

Also, be aware of the common factor problem. Suppose we wanted to fit the
ARMA(2, 3) model

    $(1 - a_1 L - a_2 L^2) y_t = (1 + \beta_1 L + \beta_2 L^2 + \beta_3 L^3)\varepsilon_t$    (2.43)

Suppose that $(1 - a_1 L - a_2 L^2)$ and $(1 + \beta_1 L + \beta_2 L^2 + \beta_3 L^3)$ can each be factored
as $(1 + cL)(1 + aL)$ and $(1 + cL)(1 + b_1 L + b_2 L^2)$, respectively. Since $(1 + cL)$ is a
common factor to each, (2.43) has the equivalent, but more parsimonious, form²

    $(1 + aL) y_t = (1 + b_1 L + b_2 L^2)\varepsilon_t$    (2.44)






If you passed the last quiz, you know that $(1 - 0.25L^2) y_t = (1 + 0.5L)\varepsilon_t$ is
equivalent to $(1 + 0.5L)(1 - 0.5L) y_t = (1 + 0.5L)\varepsilon_t$ so that $y_t = 0.5y_{t-1} + \varepsilon_t$. In practice, the
polynomials will not factor exactly. However, if the factors are similar, you should try
a more parsimonious form.
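
The cancellation of a common factor can be illustrated with a few lines of polynomial arithmetic. This is a rough sketch under stated assumptions: lag polynomials are stored as coefficient lists in ascending powers of L, and the helper names `poly_mul` and `divide_by_factor` are hypothetical, not taken from the text or from any library.

```python
def poly_mul(p, q):
    """Multiply two lag polynomials given as ascending-power coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def divide_by_factor(p, c):
    """Divide p(L) by (1 + c*L), c != 0; return (quotient, remainder).

    Long division is done in descending powers, then the quotient is
    flipped back to ascending powers of L.
    """
    rem = [float(v) for v in reversed(p)]
    quotient_desc = []
    for i in range(len(rem) - 1):
        coef = rem[i] / c
        quotient_desc.append(coef)
        rem[i] -= coef * c
        rem[i + 1] -= coef
    return quotient_desc[::-1], rem[-1]


ar_poly = [1.0, 0.0, -0.25]  # 1 - 0.25 L^2
ma_poly = [1.0, 0.5]         # 1 + 0.5 L

ar_reduced, ar_rem = divide_by_factor(ar_poly, 0.5)
ma_reduced, ma_rem = divide_by_factor(ma_poly, 0.5)

print(ar_reduced, ar_rem)  # coefficients of 1 - 0.5 L, zero remainder
print(ma_reduced, ma_rem)  # coefficients of 1, zero remainder
```

A zero remainder in both divisions signals that $(1 + 0.5L)$ is common to both sides; with estimated coefficients the remainders would only be close to zero, which is the "similar factors" case described above.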

In order to ensure that the model is parsimonious, the various $a_i$ and $\beta_i$ should all
have t-statistics of 2.0 or greater (so that each coefficient is significantly different
from zero at the 5% level). Moreover, the coefficients should not be strongly
correlated with each other. Highly collinear coefficients are unstable; usually one or more
can be eliminated from the model without reducing forecast performance.

Stationarity and Invertibility 

The distribution theory underlying the use of the sample ACF and PACF as
approximations to those of the true data-generating process assumes that the $\{y_t\}$ sequence is
stationary. Moreover, t-statistics and Q-statistics also presume that the data are
stationary. The estimated autoregressive coefficients should be consistent with this
underlying assumption. Hence, we should be suspicious of an AR(1) model if the
estimated value of $a_1$ is close to unity. For an ARMA(2, q) model, the characteristic roots
of the estimated polynomial $(1 - a_1 L - a_2 L^2)$ should lie outside of the unit circle.
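
As an illustration of the root condition, the sketch below solves $1 - a_1 z - a_2 z^2 = 0$ with the quadratic formula and checks whether every root has modulus greater than one. It is a minimal example, not the author's procedure; the function names are hypothetical, and the inputs are the AR(2) estimates reported in the previous section.

```python
import cmath


def ar2_lag_roots(a1, a2):
    """Roots z of the lag polynomial 1 - a1*z - a2*z**2 = 0."""
    if a2 == 0:
        # Degenerate AR(1) case: 1 - a1*z = 0
        return [1.0 / a1] if a1 != 0 else []
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return [(-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)]


def is_stationary(a1, a2):
    """True when all roots of the lag polynomial lie outside the unit circle."""
    return all(abs(z) > 1.0 for z in ar2_lag_roots(a1, a2))


roots = ar2_lag_roots(0.713855785, -0.537843744)
print([abs(z) for z in roots])
print(is_stationary(0.713855785, -0.537843744))
```

For these estimates the roots are a complex pair with modulus $\sqrt{1/0.5378} \approx 1.36$, so the fitted AR(2) is consistent with stationarity. A unit-root case such as `is_stationary(1.0, 0.0)` fails the check.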

As discussed in greater detail in Appendix 2.1, the Box-Jenkins approach also
necessitates that the model be invertible. Formally, $\{y_t\}$ is invertible if it can be
represented by a finite-order or convergent autoregressive process. Invertibility is
important because the use of the ACF and PACF implicitly assumes that the $\{y_t\}$ sequence
can be represented by an autoregressive model. As a demonstration, consider the
simple MA(1) model:

    $y_t = \varepsilon_t - \beta_1\varepsilon_{t-1}$    (2.45)

so that if $|\beta_1| < 1$,

    $y_t/(1 - \beta_1 L) = \varepsilon_t$

or

    $y_t + \beta_1 y_{t-1} + \beta_1^2 y_{t-2} + \beta_1^3 y_{t-3} + \dots = \varepsilon_t$    (2.46)

If $|\beta_1| < 1$, (2.46) can be estimated using the Box-Jenkins method. However,
if $|\beta_1| \geq 1$, the $\{y_t\}$ sequence cannot be represented by a finite-order or convergent
AR process; as such, it is not invertible. More generally, for an ARMA model to have a
convergent AR representation, the roots of the polynomial $(1 + \beta_1 L + \beta_2 L^2 + \dots + \beta_q L^q)$
must lie outside of the unit circle. Note that there is nothing improper about a
noninvertible model. The $\{y_t\}$ sequence implied by $y_t = \varepsilon_t - \varepsilon_{t-1}$ is stationary in that it
has a constant time-invariant mean [$Ey_t = Ey_{t-s} = 0$], a constant time-invariant
variance [$\mathrm{var}(y_t) = \mathrm{var}(y_{t-s}) = \sigma^2(1 + \beta_1^2) = 2\sigma^2$], and the autocovariances $\gamma_1 = -\beta_1\sigma^2$ and
all other $\gamma_s = 0$. The problem is that the technique does not allow for the estimation
of such models. If $\beta_1 = 1$, (2.46) becomes

    $y_t + y_{t-1} + y_{t-2} + y_{t-3} + y_{t-4} + \dots = \varepsilon_t$

Clearly, the autocorrelations and partial autocorrelations between $y_t$ and $y_{t-s}$ will
never decay.
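
One way to see what invertibility buys is to try to recover the shocks from the observed series. The sketch below is an illustration under assumptions that are not in the text: the true pre-sample shock is set to 1.0, the researcher guesses 0.0, and the shocks are rebuilt with the recursion $\hat\varepsilon_t = y_t + \beta_1\hat\varepsilon_{t-1}$, which is (2.46) written recursively. The function names are hypothetical.

```python
import random


def ma1_series(beta1, shocks, e_init):
    """Generate y_t = e_t - beta1*e_{t-1}, with pre-sample shock e_init."""
    y, prev = [], e_init
    for e in shocks:
        y.append(e - beta1 * prev)
        prev = e
    return y


def recover_shocks(beta1, y, guess=0.0):
    """Rebuild the shocks from y using e_t = y_t + beta1*e_{t-1}."""
    out, prev = [], guess
    for yt in y:
        prev = yt + beta1 * prev
        out.append(prev)
    return out


rng = random.Random(1)  # arbitrary seed
shocks = [rng.gauss(0.0, 1.0) for _ in range(30)]

# Invertible case: the effect of the wrong starting guess dies out like beta1**t
y_inv = ma1_series(0.5, shocks, e_init=1.0)
err_invertible = recover_shocks(0.5, y_inv)[-1] - shocks[-1]

# beta1 = 1: the starting error never decays
y_unit = ma1_series(1.0, shocks, e_init=1.0)
err_unit = recover_shocks(1.0, y_unit)[-1] - shocks[-1]

print(err_invertible, err_unit)
```

With $\beta_1 = 0.5$ the recovery error after 30 periods is of order $0.5^{30}$, whereas with $\beta_1 = 1$ it stays at the size of the initial mistake, which mirrors the non-decaying weights in the last displayed equation.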





Goodness of Fit 

A good model will fit the data well. Obviously, $R^2$ and the average of the residual sum
of squares are common goodness-of-fit measures in ordinary least squares. The prob- 
lem with these measures is that the fit necessarily improves as more parameters are 
included in the model. Parsimony suggests using the AIC and/or SBC as more appro- 
priate measures of the overall fit of the model. Also be cautious of estimates that fail 
to converge rapidly. Most software packages estimate the parameters of an ARMA 
model using a nonlinear search procedure. If the search fails to converge rapidly, it is 
possible that the estimated parameters are unstable. In such circumstances, adding an 
additional observation or two can greatly alter the estimates. 

The third stage of the Box-Jenkins methodology involves diagnostic checking. 
The standard practice is to plot the residuals to look for outliers and for evidence of 
periods in which the model does not fit the data well. If all plausible ARMA models
show evidence of a poor fit during a reasonably long portion of the sample, it is wise 
to consider using intervention analysis, transfer function analysis, or any other of the 
multivariate estimation methods discussed in later chapters. If the variance of the 
residuals is increasing, a logarithmic transformation may be appropriate. 
Alternatively, you may wish to actually model any tendency of the variance to change 
using the ARCH techniques discussed in Chapter 3. 

It is particularly important that the residuals from an estimated model be
serially uncorrelated. Any evidence of serial correlation implies a systematic
movement in the $\{y_t\}$ sequence that is not accounted for by the ARMA coefficients
included in the model. Hence, any of the tentative models yielding nonrandom
residuals should be eliminated from consideration. To check for correlation in the
residuals, construct the ACF and the PACF of the residuals of the estimated model.
You can then use (2.41) and (2.42) to determine whether any or all of the residual
autocorrelations or partial autocorrelations are statistically significant.⁴ Although
there is no significance level that is deemed "most appropriate," be wary of any
model yielding (1) several residual correlations that are marginally significant and
(2) a Q-statistic that is barely significant at the 10 percent level. In such
circumstances, it is usually possible to formulate a better performing model. If there are
sufficient observations, fitting the same ARMA model to each of two subsamples
can provide useful information concerning the assumption that the data-generating
process is unchanging. In the AR(2) model that was estimated in the last section,
the sample was split in half. In general, suppose you estimated an ARMA(p, q)
model using a sample size of T observations. Denote the sum of the squared
residuals as SSR. Divide the T observations into two subsamples with $t_m$ observations in
the first and $t_n = T - t_m$ observations in the second. Use each subsample to estimate
the two models

    $y_t = a_0(1) + a_1(1)y_{t-1} + \dots + a_p(1)y_{t-p} + \varepsilon_t + \beta_1(1)\varepsilon_{t-1} + \dots + \beta_q(1)\varepsilon_{t-q}$
    using $t_1, \dots, t_m$

    $y_t = a_0(2) + a_1(2)y_{t-1} + \dots + a_p(2)y_{t-p} + \varepsilon_t + \beta_1(2)\varepsilon_{t-1} + \dots + \beta_q(2)\varepsilon_{t-q}$
    using $t_{m+1}, \dots, t_T$









Let the sum of the squared residuals from each model be $SSR_1$ and $SSR_2$,
respectively. To test the restriction that all coefficients are equal [i.e., $a_0(1) = a_0(2)$
and $a_1(1) = a_1(2)$ and ... $a_p(1) = a_p(2)$ and $\beta_1(1) = \beta_1(2)$ and ... $\beta_q(1) = \beta_q(2)$], use
an F-test and form⁵

    $F = \dfrac{(SSR - SSR_1 - SSR_2)/n}{(SSR_1 + SSR_2)/(T - 2n)}$    (2.47)

where n = number of parameters estimated (n = p + q + 1 if an intercept is included
and p + q otherwise) and the number of degrees of freedom are (n, T - 2n).

Intuitively, if the restriction is not binding (i.e., if the coefficients are equal), the
sum $SSR_1 + SSR_2$ should equal the sum of the squared residuals from the entire
sample estimation. Hence, F should equal zero. The larger the calculated value of F, the
more restrictive is the assumption that the coefficients are equal.
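
The arithmetic of (2.47) is simple enough to wrap in a small helper. The following is a sketch only: the function name is hypothetical and the sums of squared residuals in the example are made-up numbers chosen for illustration, not results from the text.

```python
def stability_f_stat(ssr, ssr1, ssr2, n, T):
    """F-statistic of (2.47) and its degrees of freedom (n, T - 2n).

    ssr  : sum of squared residuals from the full-sample estimation
    ssr1 : sum of squared residuals from the first subsample
    ssr2 : sum of squared residuals from the second subsample
    n    : number of estimated parameters in each model
    T    : total number of observations
    """
    numerator = (ssr - ssr1 - ssr2) / n
    denominator = (ssr1 + ssr2) / (T - 2 * n)
    return numerator / denominator, (n, T - 2 * n)


# Hypothetical values: an ARMA(1, 1) with intercept (n = 3) and T = 100
f_value, dof = stability_f_stat(ssr=120.0, ssr1=50.0, ssr2=60.0, n=3, T=100)
print(f_value, dof)

# When splitting the sample does not reduce the residual sum of squares, F = 0
f_zero, _ = stability_f_stat(ssr=110.0, ssr1=50.0, ssr2=60.0, n=3, T=100)
print(f_zero)
```

The computed value would then be compared with a critical value from an F table with (n, T - 2n) degrees of freedom.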

Similarly, a model can be estimated over only a portion of the data set. The esti- 
mated model can then be used to forecast the known values of the series. The sum of 
the squared forecast errors is a useful way to compare the adequacy of alternative 
models. Those models with poor out-of-sample forecasts should be eliminated. Some 
of the details in constructing out-of-sample forecasts are discussed in the next section. 

9. PROPERTIES OF FORECASTS

Perhaps the most important use of an ARMA model is to forecast future values of
the $\{y_t\}$ sequence. To simplify the following discussion, it is assumed that the actual
data-generating process and the current and past realizations of the $\{\varepsilon_t\}$ and $\{y_t\}$
sequences are known to the researcher. First consider the forecasts from the AR(1)
model $y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$. Updating one period, we obtain

    $y_{t+1} = a_0 + a_1 y_t + \varepsilon_{t+1}$

If you know the coefficients $a_0$ and $a_1$, you can forecast $y_{t+1}$ conditioned on the
information available at period t as

    $E_t y_{t+1} = a_0 + a_1 y_t$    (2.48)

where $E_t y_{t+j}$ is a short-hand way to write the conditional expectation of $y_{t+j}$ given the
information available at t.

Formally, $E_t y_{t+j} = E(y_{t+j} \mid y_t, y_{t-1}, y_{t-2}, \dots, \varepsilon_t, \varepsilon_{t-1}, \dots)$.

In the same way, since $y_{t+2} = a_0 + a_1 y_{t+1} + \varepsilon_{t+2}$, the forecast of $y_{t+2}$ conditioned
on the information available at period t is

    $E_t y_{t+2} = a_0 + a_1 E_t y_{t+1}$

and using (2.48)

    $E_t y_{t+2} = a_0 + a_1(a_0 + a_1 y_t)$

Thus, the forecast of $y_{t+1}$ can be used to forecast $y_{t+2}$. The point is that forecasts
can be constructed using forward iteration; the forecast of $y_{t+j}$ can be used to forecast
$y_{t+j+1}$. Since $y_{t+j+1} = a_0 + a_1 y_{t+j} + \varepsilon_{t+j+1}$, it immediately follows that

    $E_t y_{t+j+1} = a_0 + a_1 E_t y_{t+j}$    (2.49)






From (2.48) and (2.49) it should be clear that it is possible to obtain the entire
sequence of j-step-ahead forecasts by forward iteration. Consider

    $E_t y_{t+j} = a_0(1 + a_1 + a_1^2 + \dots + a_1^{j-1}) + a_1^j y_t$

This equation, called the forecast function, expresses all of the j-step-ahead
forecasts as a function of the information set in period t. Unfortunately, the quality of the
forecasts declines as we forecast further out into the future. Think of (2.49) as a
first-order difference equation in the $\{E_t y_{t+j}\}$ sequence. Since $|a_1| < 1$, the difference
equation is stable, and it is straightforward to find the particular solution to the difference
equation. If we take the limit of $E_t y_{t+j}$ as $j \to \infty$, we find that $E_t y_{t+j} \to a_0/(1 - a_1)$.
This result is really quite general: For any stationary ARMA model, the conditional
forecast of $y_{t+j}$ converges to the unconditional mean as $j \to \infty$.
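
Forward iteration of (2.49) and the closed-form forecast function can be compared directly. This is a minimal sketch with illustrative parameter values ($a_0 = 1$, $a_1 = 0.5$, $y_t = 4$) that are assumptions, not values from the text; the function names are hypothetical.

```python
def ar1_forecasts_iter(a0, a1, y_t, horizon):
    """Forecasts E_t y_{t+1}, ..., E_t y_{t+horizon} by iterating (2.49)."""
    out, f = [], y_t
    for _ in range(horizon):
        f = a0 + a1 * f
        out.append(f)
    return out


def ar1_forecast_closed(a0, a1, y_t, j):
    """Forecast function: a0*(1 + a1 + ... + a1**(j-1)) + a1**j * y_t."""
    return a0 * sum(a1 ** i for i in range(j)) + a1 ** j * y_t


a0, a1, y_t = 1.0, 0.5, 4.0  # illustrative values only
path = ar1_forecasts_iter(a0, a1, y_t, 60)

print(path[:3])                              # first three forecasts
print(ar1_forecast_closed(a0, a1, y_t, 2))   # matches the second forecast
print(path[-1], a0 / (1.0 - a1))             # long-horizon forecast vs. mean
```

The forecasts start at 3, 2.5, 2.25 and approach the unconditional mean $a_0/(1 - a_1) = 2$ as the horizon grows.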

Because the forecasts from an ARMA model will not be perfectly accurate, it is
important to consider the properties of the forecast errors. Forecasting from time
period t, we can define the j-step-ahead forecast error, $e_t(j)$, as the difference
between the realized value of $y_{t+j}$ and the forecasted value

    $e_t(j) \equiv y_{t+j} - E_t y_{t+j}$

Hence, the one-step-ahead forecast error is $e_t(1) = y_{t+1} - E_t y_{t+1} = \varepsilon_{t+1}$ (i.e., the
"unforecastable" portion of $y_{t+1}$ given the information available in t).

To find the two-step-ahead forecast error, we need to form $e_t(2) = y_{t+2} - E_t y_{t+2}$.
Since $y_{t+2} = a_0 + a_1 y_{t+1} + \varepsilon_{t+2}$ and $E_t y_{t+2} = a_0 + a_1 E_t y_{t+1}$, it follows that

    $e_t(2) = a_1(y_{t+1} - E_t y_{t+1}) + \varepsilon_{t+2} = \varepsilon_{t+2} + a_1\varepsilon_{t+1}$

You should take a few moments to demonstrate that for the AR(1) model, the
j-step-ahead forecast error is given by

    $e_t(j) = \varepsilon_{t+j} + a_1\varepsilon_{t+j-1} + a_1^2\varepsilon_{t+j-2} + a_1^3\varepsilon_{t+j-3} + \dots + a_1^{j-1}\varepsilon_{t+1}$    (2.50)

Since the mean of (2.50) is zero, the forecasts are unbiased estimates of each
value $y_{t+j}$. The proof is trivial. Since $E_t\varepsilon_{t+j} = E_t\varepsilon_{t+j-1} = \dots = E_t\varepsilon_{t+1} = 0$, the
conditional expectation of (2.50) is $E_t e_t(j) = 0$. Since the expected value of the forecast
error is zero, the forecasts are unbiased.

Although unbiased, the forecasts from an ARMA model are necessarily
inaccurate. To find the variance of the forecast error, continue to assume that the elements
of the $\{\varepsilon_t\}$ sequence are independent with a variance equal to $\sigma^2$. Hence, from (2.50)
the variance of the forecast error is

    $\mathrm{var}[e_t(j)] = \sigma^2[1 + a_1^2 + a_1^4 + a_1^6 + \dots + a_1^{2(j-1)}]$    (2.51)

Thus, the one-step-ahead forecast error variance is $\sigma^2$, the two-step-ahead
forecast error variance is $\sigma^2(1 + a_1^2)$, and so forth. The essential point to note is that the
variance of the forecast error is an increasing function of j. As such, you can have
more confidence in short-term forecasts than in long-term forecasts. In the limit as
$j \to \infty$, the forecast error variance converges to $\sigma^2/(1 - a_1^2)$; hence, the forecast
error variance converges to the unconditional variance of the $\{y_t\}$ sequence.

Moreover, assuming the $\{\varepsilon_t\}$ sequence is normally distributed, you can place
confidence intervals around the forecasts. The one-step-ahead forecast of $y_{t+1}$ is $a_0 + a_1 y_t$
and the forecast error variance is $\sigma^2$. As such, the 95 percent confidence interval for the
one-step-ahead forecast can be constructed as

    $a_0 + a_1 y_t \pm 1.96\sigma$

We can construct a confidence interval for the two-step-ahead forecast in
the same way. From (2.49), the two-step-ahead forecast is $a_0(1 + a_1) + a_1^2 y_t$ and (2.51)
indicates that $\mathrm{var}[e_t(2)]$ is $\sigma^2(1 + a_1^2)$. Thus, the 95 percent confidence interval for the
two-step-ahead forecast is

    $a_0(1 + a_1) + a_1^2 y_t \pm 1.96\sigma(1 + a_1^2)^{1/2}$
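
The forecast error variance in (2.51) and the resulting intervals can be computed for any horizon. The sketch below assumes known coefficients and normally distributed shocks, as in the text; the numerical values ($a_0 = 1$, $a_1 = 0.5$, $y_t = 4$, $\sigma^2 = 1$) and the function names are illustrative assumptions.

```python
import math


def ar1_forecast_error_var(a1, sigma2, j):
    """Equation (2.51): sigma2 * (1 + a1**2 + a1**4 + ... + a1**(2*(j-1)))."""
    return sigma2 * sum(a1 ** (2 * i) for i in range(j))


def ar1_interval(a0, a1, y_t, sigma2, j, z=1.96):
    """95 percent interval (for z = 1.96) around the j-step-ahead forecast."""
    point = a0 * sum(a1 ** i for i in range(j)) + a1 ** j * y_t
    half_width = z * math.sqrt(ar1_forecast_error_var(a1, sigma2, j))
    return point - half_width, point + half_width


a0, a1, y_t, sigma2 = 1.0, 0.5, 4.0, 1.0  # illustrative values only

print(ar1_forecast_error_var(a1, sigma2, 1))   # one-step variance: sigma2
print(ar1_forecast_error_var(a1, sigma2, 2))   # two-step: sigma2*(1 + a1**2)
print(ar1_forecast_error_var(a1, sigma2, 50), sigma2 / (1 - a1 ** 2))
print(ar1_interval(a0, a1, y_t, sigma2, 1))
print(ar1_interval(a0, a1, y_t, sigma2, 2))
```

The intervals widen with the horizon and level off once the forecast error variance is close to the unconditional variance $\sigma^2/(1 - a_1^2)$.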



Higher-Order Models 



To generalize the discussion, it is possible to use the iterative technique to derive the
forecasts for any ARMA(p, q) model. To keep the algebra simple, consider the
ARMA(2, 1) model

    $y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-2} + \varepsilon_t + \beta_1\varepsilon_{t-1}$    (2.52)

Updating one period yields

    $y_{t+1} = a_0 + a_1 y_t + a_2 y_{t-1} + \varepsilon_{t+1} + \beta_1\varepsilon_t$

If we continue to assume that (1) all coefficients are known; (2) all variables
subscripted t, t-1, t-2, ... are known at period t; and (3) $E_t\varepsilon_{t+j} = 0$ for j > 0, the
conditional expectation of $y_{t+1}$ is

    $E_t y_{t+1} = a_0 + a_1 y_t + a_2 y_{t-1} + \beta_1\varepsilon_t$    (2.53)

Equation (2.53) is the one-step-ahead forecast of $y_{t+1}$. The one-step-ahead
forecast error is the difference between $y_{t+1}$ and $E_t y_{t+1}$ so that $e_t(1) = \varepsilon_{t+1}$. To find the
two-step-ahead forecast, update (2.52) by two periods

    $y_{t+2} = a_0 + a_1 y_{t+1} + a_2 y_t + \varepsilon_{t+2} + \beta_1\varepsilon_{t+1}$

The conditional expectation of $y_{t+2}$ is

    $E_t y_{t+2} = a_0 + a_1 E_t y_{t+1} + a_2 y_t$    (2.54)

Equation (2.54) expresses the two-step-ahead forecast in terms of the
one-step-ahead forecast and current value of $y_t$. Combining (2.53) and (2.54) yields

    $E_t y_{t+2} = a_0 + a_1[a_0 + a_1 y_t + a_2 y_{t-1} + \beta_1\varepsilon_t] + a_2 y_t$
    $\phantom{E_t y_{t+2}} = a_0(1 + a_1) + [a_1^2 + a_2]y_t + a_1 a_2 y_{t-1} + a_1\beta_1\varepsilon_t$

To find the two-step-ahead forecast error, subtract (2.54) from $y_{t+2}$. Thus,

    $e_t(2) = a_1(y_{t+1} - E_t y_{t+1}) + \varepsilon_{t+2} + \beta_1\varepsilon_{t+1}$    (2.55)

Since $y_{t+1} - E_t y_{t+1}$ is the one-step-ahead forecast error, we can write the forecast
error as

    $e_t(2) = (a_1 + \beta_1)\varepsilon_{t+1} + \varepsilon_{t+2}$    (2.56)

Finally, all j-step-ahead forecasts can be obtained from

    $E_t y_{t+j} = a_0 + a_1 E_t y_{t+j-1} + a_2 E_t y_{t+j-2}, \quad j \geq 2$    (2.57)









Equation (2.57) suggests that the forecasts will satisfy a second-order difference
equation. As long as the characteristic roots of (2.57) lie inside the unit circle, the
forecasts will converge to the unconditional mean: $a_0/(1 - a_1 - a_2)$. We can use (2.57) to find
the j-step-ahead forecast errors. Since $y_{t+j} = a_0 + a_1 y_{t+j-1} + a_2 y_{t+j-2} + \varepsilon_{t+j} + \beta_1\varepsilon_{t+j-1}$, the
j-step-ahead forecast error is

    $e_t(j) = a_1(y_{t+j-1} - E_t y_{t+j-1}) + a_2(y_{t+j-2} - E_t y_{t+j-2}) + \varepsilon_{t+j} + \beta_1\varepsilon_{t+j-1}$
    $\phantom{e_t(j)} = a_1 e_t(j-1) + a_2 e_t(j-2) + \varepsilon_{t+j} + \beta_1\varepsilon_{t+j-1}$    (2.58)
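
The recursion in (2.53), (2.54), and (2.57) translates directly into code. What follows is a sketch with known coefficients; the parameter values are illustrative assumptions chosen so that the characteristic roots lie inside the unit circle, and the function name is hypothetical.

```python
def arma21_forecasts(a0, a1, a2, b1, y_t, y_tm1, eps_t, horizon):
    """Forecasts E_t y_{t+1}, ..., E_t y_{t+horizon} for the ARMA(2, 1) in (2.52)."""
    out = [a0 + a1 * y_t + a2 * y_tm1 + b1 * eps_t]       # (2.53)
    if horizon >= 2:
        out.append(a0 + a1 * out[0] + a2 * y_t)           # (2.54)
    for _ in range(3, horizon + 1):
        out.append(a0 + a1 * out[-1] + a2 * out[-2])      # (2.57)
    return out[:horizon]


# Illustrative values only
a0, a1, a2, b1 = 1.0, 0.7, -0.2, 0.5
y_t, y_tm1, eps_t = 2.0, 1.0, 0.4

path = arma21_forecasts(a0, a1, a2, b1, y_t, y_tm1, eps_t, 200)

# Closed form of the two-step-ahead forecast obtained by combining (2.53) and (2.54)
two_step_closed = (a0 * (1 + a1) + (a1 ** 2 + a2) * y_t
                   + a1 * a2 * y_tm1 + a1 * b1 * eps_t)

print(path[0], path[1], two_step_closed)
print(path[-1], a0 / (1 - a1 - a2))  # long-horizon forecast vs. unconditional mean
```

Note that the moving-average term affects only the one-step-ahead forecast directly; from the second step on, the forecasts evolve according to the autoregressive part alone.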

In practice, you will not know the actual order of the ARMA process or the actual
values of the coefficients of that process. Instead, to create out-of-sample forecasts, it
is necessary to use the estimated coefficients from what you believe to be the most
appropriate form of an ARMA model. The rule of thumb is that forecasts from an
ARMA model should never be trusted if the model is estimated with fewer than 50
observations. Suppose you have T observations of the $\{y_t\}$ sequence and choose to fit
an ARMA(2, 1) model to the data. Let a hat or caret (i.e., ^) over a parameter denote
the estimated value of a parameter, and let $\{\hat\varepsilon_t\}$ denote the residuals of the estimated
model. Hence, the estimated ARMA(2, 1) model can be written as

    $y_t = \hat a_0 + \hat a_1 y_{t-1} + \hat a_2 y_{t-2} + \hat\varepsilon_t + \hat\beta_1\hat\varepsilon_{t-1}$

Given that the sample contains T observations, the out-of-sample forecasts are
easily constructed. For example, you can use (2.53) to forecast the value of $y_{T+1}$
conditional on the T observations as

    $E_T y_{T+1} = \hat a_0 + \hat a_1 y_T + \hat a_2 y_{T-1} + \hat\beta_1\hat\varepsilon_T$    (2.59)

Once you know the values of $\hat a_0$, $\hat a_1$, $\hat a_2$, and $\hat\beta_1$, (2.59) can easily be constructed
using the actual values $y_T$, $y_{T-1}$, and $\hat\varepsilon_T$ (i.e., the last residual of your estimated
model). Similarly, the forecast of $y_{T+2}$ can be constructed as

    $E_T y_{T+2} = \hat a_0 + \hat a_1 E_T y_{T+1} + \hat a_2 y_T$

where $E_T y_{T+1}$ is the forecast from (2.59).

Given these two forecasts, all subsequent forecasts can be obtained from the
difference equation

    $E_T y_{T+j} = \hat a_0 + \hat a_1 E_T y_{T+j-1} + \hat a_2 E_T y_{T+j-2}$ for $j \geq 2$

Unfortunately, it is much more difficult to construct confidence intervals for the
forecast errors. Not only is it necessary to include the effects of the stochastic
variation in the future values of $\{y_{T+j}\}$, it is also necessary to incorporate the fact that the
coefficients are estimated with error.

Forecast Evaluation 

Now that you have estimated a series and have forecasted its future values, the
obvious question is, "How good are my forecasts?" Typically, there will be several
plausible models that you can select to use for your forecasts. Do not be fooled into
thinking that the one with the best fit is the one that will forecast the best. To make a
simple point, suppose you wanted to forecast the future values of the ARMA(2, 1)
process given by (2.52). If you could forecast the value of $y_{T+1}$ using (2.53), you
would obtain the one-step-ahead forecast error

    $e_T(1) = y_{T+1} - a_0 - a_1 y_T - a_2 y_{T-1} - \beta_1\varepsilon_T = \varepsilon_{T+1}$

Since the forecast error is the pure unforecastable portion of $y_{T+1}$, no other
ARMA model can provide you with superior forecasting performance. However, we
need to estimate the parameters of the process, so our forecasts must be made using
(2.59). As such, our forecast error will be

    $\hat e_T(1) = y_{T+1} - (\hat a_0 + \hat a_1 y_T + \hat a_2 y_{T-1} + \hat\beta_1\hat\varepsilon_T)$

Clearly, the two forecast errors are not identical. When we forecast using (2.59),
the coefficients (and the residuals) are estimated imprecisely. The forecasts made
using the estimated model extrapolate this coefficient uncertainty into the future.
Since coefficient uncertainty increases as the model becomes more complex, it could
be that an estimated AR(1) model forecasts the process given by (2.52) better than an
estimated ARMA(2, 1) model.

How do you know which one of several reasonable models has the best
forecasting performance? One way to answer this question is to put the alternative models
to a head-to-head test. Since the future values of the series are unknown, you can hold
back a portion of the observations from the estimation process. As such, you can
estimate the alternative models over the shortened span of data and use these estimates
to forecast the observations of the holdback period. You can then compare the
properties of the forecast errors from the two models. To take a simple example, suppose
that $\{y_t\}$ contains a total of 150 observations and that you are unsure as to whether an
AR(1) or an MA(1) model best captures the behavior of the series.

One way to proceed is to use the first 100 observations to estimate both models
and use each to forecast the value of $y_{101}$. Since you know the actual value of $y_{101}$,
you can construct the forecast error obtained from the AR(1) and from the MA(1).
These two forecast errors are precisely those that someone would have made if they
had been making a one-step-ahead forecast in period 100. Now, re-estimate an AR(1)
and an MA(1) model using the first 101 observations. Although the estimated
coefficients will change somewhat, they are those that someone would have obtained in
period 101. Use the two models to forecast the value of $y_{102}$. Given that you know
the actual value of $y_{102}$, you can construct two more forecast errors. Since you know
all values of the $\{y_t\}$ sequence through period 150, you can continue this process so
as to obtain two series of one-step-ahead forecast errors, each containing 50
observations. To keep the notation simple, let $\{f_{1i}\}$ and $\{f_{2i}\}$ denote the sequence of forecasts
from the AR(1) and the MA(1), respectively. Similarly, $\{e_{1i}\}$ and $\{e_{2i}\}$ denote the
sequences of forecast errors from the AR(1) and the MA(1), respectively. If you
understand the notation, it should be clear that $f_{11} = E_{100}y_{101}$ is the first forecast
using the AR(1), $e_{11} = y_{101} - f_{11}$ is the first forecast error (where the first holdback
observation is $y_{101}$), and $e_{2,50}$ is the last forecast error from the MA(1).
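
The re-estimate-and-forecast loop described above can be sketched as follows. Several simplifications are assumptions of this example and not part of the text: the data are simulated, the AR(1) is estimated by ordinary least squares, and a sample-mean forecast stands in for the second model, since estimating an MA(1) requires a nonlinear search that is beyond a short sketch. All function names are hypothetical.

```python
import random


def ar1_ols_forecast(history):
    """Fit y_t = a0 + a1*y_{t-1} by OLS on `history`; return the one-step forecast."""
    x, z = history[:-1], history[1:]
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    a1 = sxz / sxx
    a0 = zbar - a1 * xbar
    return a0 + a1 * history[-1]


def mean_forecast(history):
    """Stand-in competitor: forecast the next value by the sample mean."""
    return sum(history) / len(history)


def rolling_one_step(y, first_origin, forecaster):
    """Re-estimate at each origin and collect one-step-ahead forecasts and errors."""
    forecasts, errors = [], []
    for origin in range(first_origin, len(y)):
        f = forecaster(y[:origin])       # uses observations 1..origin only
        forecasts.append(f)
        errors.append(y[origin] - f)     # y[origin] is observation origin + 1
    return forecasts, errors


# Simulated series of 150 observations (illustrative data only)
rng = random.Random(42)
y, prev = [], 0.0
for _ in range(150):
    prev = 0.7 * prev + rng.gauss(0.0, 1.0)
    y.append(prev)

f1, e1 = rolling_one_step(y, 100, ar1_ols_forecast)
f2, e2 = rolling_one_step(y, 100, mean_forecast)
print(len(e1), len(e2))
```

Here `f1[0]` plays the role of $f_{11} = E_{100}y_{101}$ and `e1[0]` the role of $e_{11}$; each call to the forecaster sees only the data that would have been available at that origin.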

Obviously, it is desirable that the forecast errors have a mean near zero and a
small variance. A regression-based method to assess the forecasts is to use the 50
forecasts from the AR(1) to estimate an equation of the form

    $y_{100+i} = a_0 + a_1 f_{1i} + v_{1i}, \quad i = 1, \dots, 50$

If the forecasts are unbiased, an F-test should allow you to impose the restriction
$a_0 = 0$ and $a_1 = 1$. Similarly, the residual series $\{v_{1i}\}$ should act as a white-noise
process. It is a good idea to plot $\{v_{1i}\}$ against $\{y_{100+i}\}$ to determine if there are
periods in which your forecasts are especially poor. Now repeat the process with the
forecasts from the MA(1). In particular, use the 50 forecasts from the MA(1) to estimate

    $y_{100+i} = b_0 + b_1 f_{2i} + v_{2i}, \quad i = 1, \dots, 50$

Again, if you use an F-test, you should not be able to reject the joint hypothesis
$b_0 = 0$ and $b_1 = 1$. If the significance levels from the two F-tests are similar, you might
select the model with the smallest residual variance; that is, select the AR(1) if
$\mathrm{var}(v_{1i}) < \mathrm{var}(v_{2i})$.

More generally, you might want to have a holdback period that differs from 50
observations. With a very small sample, it may not be possible to hold back 50
observations. Small samples are a problem since Ashley (1997) shows that very large
samples are often necessary to reveal a significant difference between the out-of-sample
forecasting performances of similar models. Hence, you need to have enough
observations to have well-estimated coefficients for the in-sample period and enough
out-of-sample forecasts so that the test has good power. If you have a large sample, it is
typical to hold back as much as 50 percent of the data set. Also, you might want to
use j-step-ahead forecasts instead of one-step-ahead forecasts. For example, if you
have quarterly data and want to forecast one year into the future, you can perform the
analysis using four-step-ahead forecasts. Nevertheless, once you have the two
sequences of forecast errors, you can compare their properties.

Instead of using a regression-based approach, many researchers would select the
model with the smallest mean square prediction error (MSPE). If there are H
observations in the holdback periods, the MSPE for the AR(1) can be calculated as

    $\mathrm{MSPE} = \dfrac{1}{H}\sum_{i=1}^{H} e_{1i}^2$

Several methods have been proposed to determine whether one MSPE is
statistically different from the other. If you put the larger of the two MSPEs in the
numerator, a standard recommendation is to use the F-statistic

    $F = \sum_{i=1}^{H} e_{1i}^2 \Big/ \sum_{i=1}^{H} e_{2i}^2$    (2.60)

The intuition is that the value of F will equal unity if the forecast errors from the
two models are identical. A very large value of F implies that the forecast errors from
the first model are substantially larger than those from the second. Under the null
hypothesis of equal forecasting performance, (2.60) has a standard F-distribution
with (H, H) degrees of freedom if the following three assumptions hold:

1. The forecast errors have zero mean and are normally distributed 

2. The forecast errors are serially uncorrelated 

3. The forecast errors are contemporaneously uncorrelated with each other 
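
The MSPE and the ratio in (2.60) are straightforward to compute from two sequences of forecast errors. This is a minimal sketch with made-up error values; the function names are hypothetical, and the ratio is only F-distributed when the three assumptions listed above hold.

```python
def mspe(errors):
    """Mean square prediction error: (1/H) * sum of squared forecast errors."""
    return sum(e * e for e in errors) / len(errors)


def mspe_f_ratio(e1, e2):
    """Ratio (2.60) with the larger sum of squares in the numerator.

    Returns (F, model), where `model` is 1 or 2 and identifies which
    model's errors ended up in the numerator.
    """
    s1 = sum(e * e for e in e1)
    s2 = sum(e * e for e in e2)
    return (s1 / s2, 1) if s1 >= s2 else (s2 / s1, 2)


# Made-up forecast errors for illustration
errors_model1 = [1.0, -2.0, 1.0]
errors_model2 = [1.0, 1.0, -1.0]

print(mspe(errors_model1), mspe(errors_model2))
print(mspe_f_ratio(errors_model1, errors_model2))
```

Because both sequences have the same length H, the ratio of the sums of squares equals the ratio of the two MSPEs.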






Although it is common practice to assume that the $\{\varepsilon_t\}$ sequence is normally
distributed, it is not necessarily the case that the forecast errors are normally distributed
with a mean value of zero. Similarly, the forecast errors may be serially correlated; this is
particularly true if you use multi-step-ahead forecasts. For example, equation (2.56)
indicated that the two-step-ahead forecast error for $y_{t+2}$ is

    $e_t(2) = (a_1 + \beta_1)\varepsilon_{t+1} + \varepsilon_{t+2}$

and updating by one period yields the two-step-ahead forecast error for $y_{t+3}$

    $e_{t+1}(2) = (a_1 + \beta_1)\varepsilon_{t+2} + \varepsilon_{t+3}$

It should be clear that the two forecast errors are correlated. In particular,

    $E[e_t(2)e_{t+1}(2)] = (a_1 + \beta_1)\sigma^2$

The point is that predicting $y_{t+2}$ from the perspective of period t and predicting
$y_{t+3}$ from the perspective of period t+1 both contain an error due to the presence of
$\varepsilon_{t+2}$. However, for i > 1, $E[e_t(2)e_{t+i}(2)] = 0$ since there are no overlapping forecasts.
Hence, the autocorrelations of the two-step-ahead forecast errors cut to zero after lag
1. You should be able to demonstrate the general result that j-step-ahead forecast
errors act as an MA(j-1) process.

Finally, the forecast errors from the two alternative models will usually be highly
correlated with each other. For example, a negative realization of $\varepsilon_{t+1}$ will tend to
cause the forecasts from both models to be too high. Unfortunately, the violation of
any one of these assumptions means that the ratio of the MSPEs in (2.60) does not
have an F-distribution.



THE GRANGER-NEWBOLD TEST Granger and Newbold (1976) show how
to overcome the problem of contemporaneously correlated forecast errors. Use the
two sequences of forecast errors to form

    $x_i = e_{1i} + e_{2i}$ and $z_i = e_{1i} - e_{2i}$

Given that the first two assumptions above are valid, under the null hypothesis
of equal forecast accuracy, $x_i$ and $z_i$ should be uncorrelated:

    $\rho_{xz} = E x_i z_i = E(e_{1i}^2 - e_{2i}^2)$

Model 1 has a larger MSPE if $\rho_{xz}$ is positive and model 2 has a larger MSPE if
$\rho_{xz}$ is negative. Let $r_{xz}$ denote the sample correlation coefficient between $\{x_i\}$ and
$\{z_i\}$. Granger and Newbold (1976) show that

    $r_{xz} \Big/ \sqrt{(1 - r_{xz}^2)/(H - 1)}$    (2.61)

has a t-distribution with H - 1 degrees of freedom. Thus, if $r_{xz}$ is statistically
different from zero, model 1 has a larger MSPE if $r_{xz}$ is positive and model 2 has a larger
MSPE if $r_{xz}$ is negative.
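
The Granger-Newbold statistic needs only the sample correlation between the sums and the differences of the two error sequences. The sketch below assumes the usual demeaned sample correlation coefficient and the form $r_{xz}/\sqrt{(1 - r_{xz}^2)/(H - 1)}$ for the test statistic; the error values are made up and the function name is hypothetical.

```python
import math


def granger_newbold(e1, e2):
    """Return (r_xz, t_stat) for two equal-length forecast-error sequences."""
    H = len(e1)
    x = [a + b for a, b in zip(e1, e2)]
    z = [a - b for a, b in zip(e1, e2)]
    xbar, zbar = sum(x) / H, sum(z) / H
    sxz = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    szz = sum((zi - zbar) ** 2 for zi in z)
    r = sxz / math.sqrt(sxx * szz)
    t_stat = r / math.sqrt((1.0 - r * r) / (H - 1))
    return r, t_stat


# Made-up forecast errors: model 1 misses by more than model 2
e1 = [2.0, -2.0, 1.0, -1.0]
e2 = [1.0, -1.0, 1.0, -1.0]

r_xz, gn_stat = granger_newbold(e1, e2)
print(r_xz, gn_stat)
```

A positive correlation points to model 1 having the larger MSPE; swapping the two sequences flips the sign of both numbers. The statistic would be compared with a t table with H - 1 degrees of freedom, and a sample this small is for illustration only.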



THE DIEBOLD-MARIANO TEST There is a very large literature trying to
extend the Granger-Newbold test so as to relax assumptions 1 and 2. Moreover,
applied econometricians might be interested in measures of forecasting performance
other than the sum of squared errors. Indeed, it should be clear that using the sum of
squared errors as a criterion makes sense only if the loss from making an incorrect
forecast is quadratic. However, there are many other possibilities. For example, if
your loss depends on the size of the forecast error, you should be concerned with the
absolute values of the forecast errors. Alternatively, an options trader receives a
payoff of zero if the value of the underlying asset lies below the strike price but receives
a one-dollar payoff for each dollar the asset price rises above the strike price. Diebold
and Mariano (1995) have developed a test that relaxes assumptions 1 to 3 and allows
for an objective function that is not quadratic.

If we consider only one-step-ahead forecasts, we can eliminate the subscript /. As 
such, we can let the loss from a forecast error in period i be denoted by g(e,). In the 
typical case of mean squared errors, the loss is ef. Nevertheless, to allow the loss 
function to be general, we can write the differential loss in period i from using model 
1 versus model 2 as c/ ( - = - g(e 2l -)- The mean loss can be obtained as 

1 " 

/)] ( 2 - 62 ) 

H i—\ 

Under the null hypothesis of equal forecast accuracy, the value of <7 is zero. Since 
7 is the mean of the individual losses, under fairly weak conditions, the central limit 
theorem implies that <7 should have a normal distribution, Hence, it is not necessary 
to assume that the individual forecast errors are no rmally distributed. Thus, if we 
knew var(7), we could construct the ratio 7/»/var(7) and test the null hypothesis of 
equal forecast accuracy using a standard normal distribution. In practice, the imple- 
mentation of the test is complicated by the fact that we need to estimate var(7 ). 

If the $\{d_i\}$ series is serially uncorrelated with a sample variance of $\gamma_0$, the estimate
of $\mathrm{var}(\bar{d})$ is simply $\gamma_0/(H - 1)$. Since we use the estimated value of the variance,
the expression $\bar{d}/\sqrt{\gamma_0/(H - 1)}$ has a $t$-distribution with $H - 1$ degrees of freedom.

There is a very large literature on the best way to estimate the standard deviation
of $\bar{d}$ in the presence of serial correlation. Many of the technical details are not appropriate
here. Diebold and Mariano let $\gamma_i$ denote the $i$-th autocovariance of the $\{d_i\}$
sequence. Suppose that the first $q$ values of $\gamma_i$ are different from zero. The variance
of $\bar{d}$ can be approximated by $\mathrm{var}(\bar{d}) = [\gamma_0 + 2\gamma_1 + \ldots + 2\gamma_q](H - 1)^{-1}$; the standard
deviation is the square root. As such, Harvey, Leybourne, and Newbold (1998) recommend
constructing the Diebold-Mariano (DM) statistic as



$$DM = \frac{\bar{d}}{\sqrt{(\gamma_0 + 2\gamma_1 + \ldots + 2\gamma_q)/(H - 1)}} \qquad (2.63)$$

Compare the sample value of (2.63) to a $t$-statistic with $H - 1$ degrees of freedom.⁶
It is also possible to use the method for $j$-step-ahead forecasts. If $\{e_{1i}\}$ and $\{e_{2i}\}$
denote two sequences of $j$-step-ahead forecast errors, the DM statistic is



$$DM = \frac{\bar{d}}{\sqrt{(\gamma_0 + 2\gamma_1 + \ldots + 2\gamma_q)/[H + 1 - 2j + H^{-1} j(j - 1)]}}$$
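The DM calculation can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function names `autocovariance` and `dm_statistic` are hypothetical, the autocovariances are computed with divisor $H$ (an assumption, since the text does not spell out the estimator), and the truncation lag `q` must be supplied by the user. For `j = 1` the denominator reduces to $H - 1$, so the function reproduces (2.63).

```python
import math


def autocovariance(d, k):
    """Sample autocovariance of d at lag k, using divisor H (an assumption)."""
    H = len(d)
    d_bar = sum(d) / H
    return sum((d[i] - d_bar) * (d[i - k] - d_bar) for i in range(k, H)) / H


def dm_statistic(e1, e2, g=lambda e: e * e, q=0, j=1):
    """DM statistic for two series of j-step-ahead forecast errors.

    g is the loss function, q the number of nonzero autocovariances of the
    loss differential, and j the forecast horizon.
    """
    if len(e1) != len(e2):
        raise ValueError("both models need the same number of forecast errors")
    d = [g(a) - g(b) for a, b in zip(e1, e2)]
    H = len(d)
    d_bar = sum(d) / H
    # gamma_0 + 2*gamma_1 + ... + 2*gamma_q
    long_run = autocovariance(d, 0) + 2 * sum(
        autocovariance(d, k) for k in range(1, q + 1)
    )
    if long_run <= 0:
        raise ValueError("estimated variance is not positive; try another q")
    # Small-sample denominator; equals H - 1 when j = 1.
    denom = H + 1 - 2 * j + j * (j - 1) / H
    return d_bar / math.sqrt(long_run / denom)


# Made-up forecast errors, absolute-value loss, one nonzero autocovariance.
stat = dm_statistic([1, 2, 3, 4], [0, 0, 0, 0], g=abs, q=1, j=1)
print(stat)
```

The resulting value would then be compared with a $t$-distribution with $H - 1$ degrees of freedom, as described above. The explicit check on the variance estimate is a design choice: a truncated sum of sample autocovariances is not guaranteed to be positive.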

An example showing the appropriate use of the Granger-Newbold and
Diebold-Mariano tests is provided in the next section.




10. A MODEL OF THE PRODUCER PRICE INDEX

The ARMA estimations performed in Section 7 were almost too straightforward. In 
practice, we rarely find a data series precisely conforming to a theoretical ACF or 
PACF. This section is intended to illustrate some of the ambiguities frequently 
encountered in the Box-Jenkins technique. These ambiguities may lead two equally 
skilled econometricians to estimate and forecast the same series using very different 
ARMA processes. Many view the necessity of relying on the researcher’s judgment 
and experience as a serious weakness of a procedure that is designed to be scientific. 
Yet, if you make reasonable choices, you will select models that come very close to 
mimicking the actual data-generating process.

It is useful to illustrate the Box-Jenkins modeling procedure by estimating a 
quarterly model of the U.S. Producer Price Index (PPI). The data used in this section
are the series labeled PPI on the file QUARTERLY.XLS. Exercise 11 at the end of
this chapter asks you to reproduce the results reported below. 

Panel (a) of Figure 2.5 clearly reveals that there is little point in modeling the
series as being stationary; there is a decidedly positive trend or drift throughout the
period 1960Q1 to 2002Q1. The first difference of the series seems to have a constant
mean, although inspection of Panel (b) suggests that the variance is an increasing
function of time. As shown in Panel (c), the first difference of the logarithm (denoted
by $\Delta lppi_t$) is the most likely candidate to be covariance stationary. Moreover, there is
a strong economic reason to be interested in the logarithmic change since $\Delta lppi_t$ is a
measure of inflation. However, the large volatility of the PPI accompanying the oil 
price shocks in the 1970s should make us somewhat wary of the assumption that the 
process is covariance stationary. At this point, some researchers would make addi- 
tional transformations intended to reduce the volatility exhibited in the 1970s. 
However, it seems reasonable to estimate a model of the $\{\Delta lppi_t\}$ sequence without
any further transformations. As always, you should maintain a healthy skepticism of 
the accuracy of your model. 
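The log-difference transformation described above can be sketched in Python. The function name `log_difference` and the index levels are made up for illustration; they are not values from QUARTERLY.XLS.

```python
import math


def log_difference(levels):
    """First difference of the logarithm: log(x_t) - log(x_{t-1})."""
    if any(x <= 0 for x in levels):
        raise ValueError("logarithms require strictly positive levels")
    return [math.log(curr) - math.log(prev)
            for prev, curr in zip(levels, levels[1:])]


# Hypothetical quarterly index levels.
index_levels = [100.0, 101.0, 103.0, 102.5]
inflation = log_difference(index_levels)
print(inflation)
```

For small changes, each element is close to the quarterly percentage change divided by 100, which is why the log difference can be read as an inflation rate.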

Before reading on, you should examine the autocorrelation and partial autocorrelation
functions of the $\{\Delta lppi_t\}$ sequence shown in Figure 2.6. Try to identify the tentative
models that you would want to estimate. In making your decision, note the following:

1. The ACF and PACF converge to zero reasonably quickly. We do not want
to overdifference the data and try to model the $\{\Delta^2 lppi_t\}$ sequence.

2. The theoretical ACF of a pure MA(q) process cuts off to zero at lag q and 
the theoretical ACF of an AR(1) model decays geometrically. Examination
of Figure 2.6 suggests that neither of these specifications seems appropri- 
ate for the sample data. 

3. The ACF does not decay geometrically. The value of $\rho_1$ is 0.603 and the
values of $\rho_2$, $\rho_3$, and $\rho_4$ are 0.494, 0.451, and 0.446, respectively. Thus, the
ACF is suggestive of an AR(2) process or a process with both autoregressive
and moving average components. The PACF is such that $\phi_{11} = 0.604$
and cuts off to 0.203 abruptly (i.e., $\phi_{22} = 0.203$). Overall, the PACF suggests
that we should consider models such that $p = 1$ and $p = 2$.
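As a rough check on point 3, the reported autocorrelations can be compared with what a pure AR(1) process would imply. The sketch below uses only the four sample autocorrelations quoted above; the variable names are illustrative, and the lag-2 partial autocorrelation is computed from the standard identity $\phi_{22} = (\rho_2 - \rho_1^2)/(1 - \rho_1^2)$ rather than by whatever estimation method produced the values in Figure 2.6.

```python
# Sample autocorrelations rho_1..rho_4 as reported in the text.
rho = [0.603, 0.494, 0.451, 0.446]

# ACF implied by an AR(1) whose first autocorrelation matches rho_1:
# rho_k = rho_1 ** k, i.e. geometric decay.
ar1_implied = [rho[0] ** k for k in range(1, len(rho) + 1)]

# Lag-2 partial autocorrelation implied by rho_1 and rho_2.
phi22 = (rho[1] - rho[0] ** 2) / (1 - rho[0] ** 2)

for k, (sample, implied) in enumerate(zip(rho, ar1_implied), start=1):
    print(f"lag {k}: sample {sample:.3f}, AR(1)-implied {implied:.3f}")
print(f"phi_22 implied by rho_1 and rho_2: {phi22:.3f}")
```

An AR(1) matched to $\rho_1$ would imply autocorrelations of roughly 0.364, 0.219, and 0.132 at lags 2 to 4, well below the sample values, which is the sense in which the ACF does not decay geometrically. The implied $\phi_{22}$ of about 0.205 is close to the reported 0.203; a small gap is to be expected when the sample PACF is estimated directly rather than derived from this identity.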







